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ABSTRACT 

The  sequential  solution,  in  recursive  form,  of  a 
growing  set  of  linear  equations,  based  upon  the  least- 
square-error  and  a  weighted  least-square-error  criterion, 
are  developed.   For  comparison  these  results  are  applied 
to  the  discrete-time  solution  of  several  estimation  and 
identification  problems.   Recursive  algorithms  for  pseudo 
inversion  and  the  best  approximate  solution  of  a  set  of 
linear  equations  are  included.   Finally,  efficient  state 
estimation  procedures  for  time-invariant  systems  using  a 
sliding-window  observer  are  presented. 
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I.   INTRODUCTION 

In  a  broad  sense  this  dissertation  is  concerned  with 
the  basic  problem  of  solving  the  linear  matrix  equation* 

Ax  =  b  (1.1) 

where  A  is  an  m  x  n  matrix,  x  an  n  x  1  vector  of  unknowns, 
and  b  an  m  x  1  vector  of  constants.   The  solution,  if  A  has 
rank  r  =  n,  is  straightforward  and  reduces  essentially  to 
matrix  inversion 

x  =  [ATA]   ATb  (1.2) 

T  ip   -1 

where  A  is  the  transpose  of  A,  [A  A]    is  the  inverse  of 

A  A,  and  where  x  denotes  the  solution  of  (1.1).   However, 

when  the  rank  of  A  is  less  than  n,  the  set  of  equations 

(1.1)  is  underdetermined  and  infinitely  many  solutions 

exist.   In  order  to  select  a  unique  solution  out  of  all 

possible  solutions  further  constraints  are  imposed.   In 

the  work  presented  here  the  minimum-norm  solution,  as  de- 

fined  by  Penrose  [5],  is  selected.   If  x  denotes  the 

selected  solution,  and  x   any  other  possible  solution,  then 

ii   ii  T 

where   x   =  trace  xx  .   The  solution  of  (1.1)  is  further 


* 

Throughout  this  dissertation  a  bar  under  a  lower  case 

letter  represents  a  column  matrix  or  vector.   Capital 
letters  generally  refer  to  matrices. 


complicated  when  the  set  of  equations  is  inconsistent  so 
that  there  is  no  solution  which  satisfies  all  equations. 
In  this  case  a  compromise  solution  has  to  be  accepted 
such  that  all  equations  are  satisfied  as  close  as  possible 
according  to  some  criterion.  Usually  this  compromise 
solution  is  selected  by  minimizing  an  error  criterion  J. 

J  =  f  (e)  (1.4) 

where  e  =  Ax  -  b.   The  most  commonly  used  criterion  is  the 
least-square-error  sum 

J  =  e  e  =  trace  ee   =  E   e,  (1.5) 

.      3. 

where  e .  is  the  i    element  of  the  vector  e. 

When  this  criterion  is  combined  with  the  minimum-norm  con- 
dition (1.3)  a  unique  general  solution  results  which 
Penrose  [5]  has  defined  as  the  best  approximate  solution  and 
which  is  obtained  using  the  concept  of  the  pseudo  inverse. 
This  general  approach  to  the  solution  of  (1.1)  is  par- 
ticularly useful  when  applied  to  discrete-time  system 
problems  such  as  state  estimation,  parameter  identification , 
and  the  limited  memory  observer  problem  as  discussed  in  the 
body  of  this  dissertation.   As  an  example,  consider  the 
problem  of  estimating  the  states  of  a  linear,  dynamic  system 
from  noise-contaminated  measurements.   The  dynamic  system 
and  the  measurements  are  given  by 


*k  =  Vk-A-i  (1-6a) 


zk  =  Mk  *k  +  \ 


(1.6b) 


where  x,  is  the  system  state  vector,  $,  ,  _,  the  discrete, 
time-varying  transition  matrix,  M,  a  time-varying  observation 
matrix  of  dimensions  1  xn,  v,  the  measurement  noise,  and  z, 
a  scalar  observation.   After  k  observations  the  data  may  be 
arranged  as  follows 


M  § 
ulvl,k 


M  $ 
2  2,k 


M, 


^k 


(1.7) 


Thus  the  state  estimation  problem  for  linear/  discrete-time 
systems  is  reduced  to  the  problem  of  solving  (1.1).   When 
the  transition  matrix  is  the  identity  matrix  this  problem 
reduces  to  estimating  a  constant  but  unknown  vector  x.   The 
estimation  problem  has  been  considered  for  many  years  by 
such  famous  mathematicians  as  Gauss  [1],  Penrose  [4,5], 
Kalman  [2],  Deutsch  [1]  and  many  others.   In  spite  of  their 
differing  approaches  the  underlying  concept  remains  the 
solution  of  (1.1). 

The  problem  of  identifying  the  coefficients  of  the 
recursive  difference  equation  describing  a  time-invariant, 
linear  system  from  a  sequence  of  noisy  response  measure- 
ments can  also  be  reduced  to  the  solution  of  (1.1).   Pre- 
vious investigators  have  solved  this  identification  problem 


using  different  methods  such  as  correlation  functions, 
deconvolution  techniques,  adaptive  model  techniques,  and 
others  as  discussed  by  Mishkin  and  Braun  [17] ,  Eveleigh 
[18],  Eykkoff  [19]  and  many  others.   The  approach  using 
the  problem  formulation  of  (1.1)  has  been  considered  by 
R.  C.  K.  Lee  [3] . 

Thus,  it  has  been  well  established  that  it  is  possible 
to  solve  discrete  estimation  or  identification  problems 
using  the  concept  of  the  best  approximate  solution  of  (1.1). 
For  real-time  computation,  as  required  for  example  in  some 
self-adaptive  control  systems,  it  is  desired  to  obtain 
numerical  results  sequentially  with  a  minimum  of  computation 
time  and  data  storage.   Since  the  dimension,  m,  of  matrix  A 
grows  at  each  step  in  time  as  additional  data  is  acquired, 
it  is  desirable  to  formulate  a  sequential  solution  to  (1.1) 
in  recursive  form.   Almost  all  known  algorithms  are  based 
upon  the  assumption  that  matrix  A  has  rank  r  =  n  whenever 
m  >  n,  which  might  not  be  true  in  general.   The  algorithm 
developed  in  this  dissertation  solves  (1.1)  sequentially 
for  the  best  approximate  solution  [5]  without  any  assump- 
tions as  to  the  rank  of  A.   The  normalized  least-square- 
error  solution  and  an  algorithm,  an  alternate  formulation 
based  upon  a  different  error  criterion,  are  developed  and 
applied  to  an  estimation  and  an  identification  problem. 
For  completeness,  recursive  non-sequential  forms  for 
obtaining  the  Penrose  inverse  and  the  best  approximate 
solution  of  (1.1),  when  the  matrix  A  has  constant  dimensions, 


are  included.   Finally,  efficient  procedures  for  sequential 
state  estimation  of  time-invariant  systems  are  presented, 
where  the  state  estimation  is  obtained  from  a  finite  but 
continuously  updated  set  of  observations  (sliding-window 
observer) .   These  results  fall  into  the  category  of  linear 
observer  theory  as  discussed  by  Luenberger  [11]  and  Bona 
[12]. 

The  development  and  discussion  of  the  foregoing  results, 
including  geometric  interpretation  and  examples,  are  pre- 
sented as  follows.   In  Chapter  II,  Eqs .  (1.1)  are  considered 
to  be  a  set  of  overdetermined  equations.   The  closed  form 
solution,  as  well  as  the  recursive  solution  (Kalman  type 
filter) ,  are  well  known.   However,  a  geometric  interpre- 
tation of  the  known  results  suggests  an  alternate  way  of 
selecting  the  compromise  solution  using  a  weighted  least- 
square-error  criterion.   This  solution  is  defined  here  as 
the  normalized  least-square-error  solution.   The  normalized 
least-square-error  solution  in  recursive  form,  which  requires 
only  a  slight  modification  of  the  least-square-error 
algorithm,  is  applied  to  a  specific  estimation  problem  as 
well  as  to  an  identification  problem.   The  latter  consists 
of  identifying  the  coefficients  of  a  difference  equation 
describing  a  dynamic,  linear,  time-invariant  system  from 
system  response  data.   The  experimental  results  are  quite 
favorable  for  the  normalized  least-square-error  solutions. 
For  the  identification  problem  it  is  shown  that  with 
sinusoidal  excitation  the  normalized  least-square-error 


solution  results  in  a  smaller  bias  error.   However,  these 
results  cannot  be  generalized  and  whether  the  normalized 
least-square-error  algorithm  should  be  used  depends 
entirely  upon  the  specific  problem  under  consideration. 

In  Chapter  III  the  general  solution  of  (1.1)  when  the 
rank  of  A  is  less  than  or  equal  to  n  is  discussed.   First 
the  definition  and  some  properties  of  the  pseudo  inverse 
and  the  best  approximate  solution  according  to  Penrose 
[4,5]  are  stated  and  some  alternate  expressions  for  the 
pseudo  inverse  are  discussed.   Then  a  complete  recursive 
algorithm  for  the  best  approximate  solution  for  the  general 
case  is  developed  for  use  as  a  sequential-estimation  pro- 
cedure.  The  algorithm  presented  here  has  the  advantage 
over  previously  published  results  in  that  the  dimensions  of 
the  matrices  involved  in  the  algorithm  remain  constant 
irrespective  of  the  dimension,  m,  and  rank  of  A.   It  is 
interesting  to  note  that  the  form  of  the  resulting  filter 
equation 

*k  =  £k_i  +  a  («k  -  aT  x^)  (1.4) 

evolves  naturally  by  solving  the  necessary  equations  and 
is  not  assumed  a  priori.   Finally,  the  algorithm  is  adapted 
for  state  estimation  of  linear,  time-varying  dynamic 
systems . 

The  solution  of  a  fixed  set  of  equations  (1.1)  is  con- 
sidered in  Chapter  IV.   Although  it  is  possible  to  obtain 
the  solution  with  the  algorithm  of  Chapter  III,  this  method 
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is  not  very  efficient  because  only  the  final  solution,  with- 
out intermediate  sequential  estimates,  is  required.   More 
efficient  methods  are  obtained  here  by  combining  an  infinite 
iterative  error  correcting  method,  as  developed  by  Noda  and 
Nagumo  [10],  and  the  Gram-Schmidt  orthogonalization  process 
[9].   The  results  are  finite-step  algorithms  for  the  solu- 
tion of  matrix  equation  (1.1),  matrix  inversion  (when  the 
matrix  is  nonsingular) ,  and  matrix  pseudo  inversion  for  the 
general  case. 

In  Chapter  V  the  sequential  solution  of  a  growing  set 
of  equations  (1.1)  is  considered  again.   However  it  is 
assumed  a  priori  that  the  set  of  equations  is  consistent 
and  that  any  subsequent  sets  of  n  equations  in  (1.1)  are 
independent.   The  problem  is  then  solved  with  the  ultimate 
goal  of  developing  finite-memory,  sliding-window  observers. 
First,  the  general  case  of  time-varying  linear  systems 
is  considered  and  a  new  algorithm  for  the  minimum-window 
observer  (an  observer  with  a  memory  limited  to  exactly 
the  minimal  set  of  data)  is  developed.   However  the  high 
sensitivity  of  this  observer  to  measurement  noise  severely 
limits  its  use  [12].   Introducing  the  further  constraints 
that  the  system  and  the  observation  matrix  are  time  invariant 
leads  to  more  useful  and  very  efficient  results.   It  is 
shown  that  with  these  constraints  it  is  possible  to  construct 
simple  and  efficient  filters  for  state  estimation  from  noise- 
contaminated  measurements  using  the  results  for  the  minimum- 
window  observer.   Also,  using  the  concept  of  the  pseudo 
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inverse,  the  memory  of  the  observer  may  be  extended  so  that 
a  sliding-window  observer  of  arbitrary  length  i  >  n  is 
obtained.   In  this  case  the  algorithm  for  state  estimation 
reduces  to 

^k  =  F*k-1  +  2  zk  d'5> 

where  F  and  g  are  constant.   The  performances  of  these 
filters  when  processing  noisy  measurements  is  illustrated 
with  an  example. 

Finally,  in  Appendix  A,  the  solution  of  a  set  of  non- 
linear equations  using  the  results  of  Chapter  IV  is 
attempted.   An  iteration  scheme  is  presented  in  which  the 
value  of  the  total  difference  quotient  for  the  nonlinear 
part  of  the  set  of  equations  is  iterated  sequentially  to 
its  true  solution.   Computational  results  are  presented  for 
examples  where  this  iteration  process  converges  while  other 
commonly  used  methods,  such  as  Newton-Raphson,  Gradient, 
and  Linear  Interpolation  fail.   This  iteration  scheme  is 
then  proposed  for  the  solution  of  sets  of  nonlinear 
differential  equation  for  networks  containing  nonlinear, 
memoryless,  dissipative  elements. 
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II.   THE  NORMALIZED  LEAST-SQUARE-ERROR  SOLUTION 

In  this  chapter  the  solution  of  a  set  of  overdetermined 
simultaneous  linear  equations  is  considered  first,  using 
the  concepts  of  the  pseudo  inverse  as  developed  byPenrose 
[4,5],   The  usual  least-square-error  solution  is  presented 
and  interpreted  geometrically.   It  is  then  shown  that  an 
alternate  solution,  which  has  been  designated  here  as  the 
normalized  least-square-error  solution,  is  also  possible 
and  leads  to  a  different  geometrical  interpretation.   These 
results  are  then  applied  to  an  estimation  problem,  and  a 
recursive  algorithm,  based  upon  the  normalized  least-square- 
error  solution,  is  developed.   The  resulting  equations  are 
similar  in  form  to  the  Kalman  [2]  type  of  discrete  filter 
as  discussed  by  R.  C.  K.  Lee  [3],  but  differ  substantially 
in  their  precise  formulation  and  the  nature  of  the  results. 
A  numerical-estimation  example  comparing  the  least-square- 
error  filter  with  the  normalized  least-square-error  filter 
is  presented.   The  recursive  equations  are  then  applied 
to  the  problem  of  identifying  the  coefficients  of  the 
difference  equation  describing  a  dynamic  system  from  a 
sequence  of  noisy  measurements  of  the  system's  response. 
The  results  of  a  numerical  example  are  presented  and  com- 
pared with  the  results  obtained  using  the  usual  least- 
square-error  filter.   These  results  indicate  that  the  error 
in  coefficient  identification  may  be  less  for  the  normalized 
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least-square-error  solution,  as  defined,  than  for  the  usual 
least-square-error  solution  available  in  the  literature. 

An  alternate  approach  to  the  solution  of  the  identifi- 
cation problem  is  to  use  estimates  for  the  past  system 
response  rather  than  the  past  observations  themselves  in  the 
problem  formulation.   As  shown  in  Example  2.3  this  procedure 
may  result  in  a  better  identification  of  the  system  co- 
efficients.  Finally,  it  is  demonstrated  that  in  the 
presence  of  measurement  noise  a  bias  error  in  the  identifi- 
cation begins  to  build  whenever  the  input  function  is 
constant.   However,  when  the  input  function  is  causing  a 
significant  dynamic  system  response,  the  estimation  error 
approaches  a  constant  bias.   The  analysis  of  bias  error  is 
performed  by  approximating  the  discrete  formulations  in 
the  continuous  time  domain  so  that  a  limiting  process  can 
be  performed  readily.   These  results  are  verified  experi- 
mentally. 

A.   LEAST-SQUARE-ERROR  AND  NORMALIZED  LEAST-SQUARE-ERROR 
SOLUTION 

The  most  common  method  of  solving  a  set  of  m  simul- 
taneous equations  in  n  unknowns,  where  m  >  n,  is  the  least- 
square-error  solution.   The  problem  consists  of  solving 
the  algebraic  relationships 

b  =  Ax  (2.1) 

where      A   is  the  m  x  n  matrix  of  coefficients 
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b   is  the  m  x  1  vector  of  constants 
x   is  the  n  x  1  vector  of  unknowns. 

A  solution  for  x  is  the  best  approximate  solution  introduced 
by  Penrose  [5] . 

x  =  A+b  (2.2) 

where  A   is  the  Penrose  pseudo-inverse,  and  x  is  the  best 
approximate  solution.   If  the  matrix  A  has  rank  n ,  this 
solution  is  obtained  as* 

x  =    [ATA]       ATb  (2.3) 

rji  rp  -1 

where  a  denotes  the  transpose  of  A  and  [A  A]    the  inverse 

T  ^ 

of  [A  A] .   In  this  case  the  solution  x  minimizes  the  cost 

function  J 

2        9     m    9 

J  =  ||  Ax  -  b||   =  |je||   =   I   ef  (2.4) 

i=l   1 
where   e  =  Ax  -  b  (2.5) 

and  e .  represents  the  components  of  the  vector  e_.   Thus 
this  solution  satisfies  all  the  equations  of  (2.1)  as  close 
as  possible  in  the  least-square-error  sense  which  is  shown 
as  follows. 


* 
If  A  has  rank  less  than  n  (i.e.,  the  rows  of  A  are  not 

independent)  the  pseudo  inverse  has  a  different  form  as 

discussed  in  Chapter  III. 
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The  minimum  of  Eq.  (1.4)  occurs  when 

—  =  2[Ax  -  b]T  A  =  0  (2.6) 

3x 

or     ATAx  =  ATb  (2.7) 


Therefore 


x  =  [ATA]    ATb  (2.8) 


A  geometric  interpretation  of  the  least-square-error 
solution  may  be  obtained  by  considering  the  two  dimensional 
case  where  the  vector  x  has  two  components  x, ,  x~.   Each 
equation  of  (2.1)  represents  a  line  in  the  x, ,  x2  plane. 
These  lines  generally  do  not  intersect  at  a  single  point. 
Now  consider  the  sequence  of  lines  given  by 

Ax  =  b  -  e  (2.9a) 

where  e  is  a  vector  with  arbitrary  elements,  e..  These 
lines  are  parallel  to  the  original  lines  but  shifted  in 
the  orthogonal  direction  by  the  amount,  s. ,  where 

2      2  "^ 
si  =  ^ali  +  a2i^    ei  '   i  =  1'2'---m  (2.9b) 

and  a. .  are  elements  of  the  matrix  A.   In  the  least-square- 
error  solution  the  lines  are  shifted  so  that  they  all  pass 
through  the  point  x, ,  x„  so  that  the  cost  function,  J 

m    ~ 
J=  I      eZ  (2.9c) 

i-1  X 

is  minimized. 
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A  different  result  may  be  obtained  by  considering  each 
equation  of  (2.1)  as  a  permissible  geometric  locus  for  the 
desired  solution  point.   In  general  this  locus  is  a  hyper- 
plane  in  n  dimensional  space,  where  n  is  the  number  of 
elements  in  the  unknown  vector  x.   If  all  the  loci  intersect 
in  a  single  point  this  solution  corresponds  to  that  of 
(2.3).   If  they  do  not  intersect  in  a  point,  the  solution 
point  may  be  defined  as  the  point  which  minimizes  the  sum 
of  the  distances  squared  to  each  locus.   This  solution  is 
defined  here  as  the  normalized  least-square-error  solution. 
Its  interpretation  is  quite  different  from  the  least-square- 
error  solution  and  in  certain  types  of  identification  prob- 
lems is  seen  to  be  more  meaningful  than  the  usual  results 
available  in  the  literature. 

The  normalized  least-square-error  solution  is  a  weighted 
least-square-error  solution  with  weighting  factors  chosen 
such  that  the  solution  point  lies  as  close  as  possible  to 
all  geometric  loci  described  by  (2.1).   This  result  can  be 
obtained  by  selecting  the  solution  x*  such  that  the  scalar 
cost  function 

m   p 
J*  =  t      d  (2.10a) 

i=l   L 

is  minimized,  where  the  d. 's  are  the  distances  from  x*  to 

i  — 

the  respective  loci  designated  by  the  subscript  i.   Before 
deriving  this  solution  it  is  desirable  to  prove  the 
following. 
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The  distance  from  a  point 

P* (x  *,x  *  ...  x  *)  to  the  plane 
12        n  c 

0  =  a-,x,  +  a~x0  +  ...  +  a  x  -  c  (2.10b) 

11     2  2  n  n 

is  given  by 

-is 
2     2         2 
|d|  =  | (a,  +  a„  +...+  a J   (a,x  *  +  a„x  *+. . . +a  x  *  -  c) 
1  '    '  ,  1     2         n      11      2    2  nn 

(2.11) 

Proof  of  Eq,  (2.11)  is  given  for  the  three  dimensional  case. 

Extension  to  higher  dimensions  is  obvious,, 

Proof 

Let  the  point  P*(x*,y*,z*)  and  the  plane  c  =  a-,  x+a-y+a^z  be 
given.   The  unit  vector  normal  to  the  plane  is  found  by 
considering  a  plane  through  the  origin  parallel  to  the 
given  plane.   Its  equation  is 

0  =  a,x  +  a„y  +  a^z  (2.12a) 

which  is  the  dot  product  of  two  vectors,  one  the  position 
vector  r  to  any  point  in  the  plane 

r  =  xi.  +  y£  +  z4  (2.12b) 

and  the  other  a  vector  normal  to  the  plane 

n  =  a1l+  a2£  +  a3£  (2.12c) 

where  x_,    fi_,    and  jb   are  unit  vectors  defining  the  three- 
dimensional  cartesian  space.   The  unit  normal  vector  is  then 
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71=  a  i  +  a2j_  +  013k  (2.12d) 

where 

2     2     2 
a.  =  (a,  +  a  +  a~)    a.   ,   i  =  1,2,3 

The  distance  |d|  is  then  given  by  (See  Fig.  2.1,  where  N  is 
the  point  in  the  given  plane  closest  to  the  given  point  P* 
and  NP*  denotes  the  vector  from  N  to  P*) 

|d|  =  I (r  -  r*)  •  TO  |  •   (2.12f) 

where  r  is  the  position  vector  of  a  point  in  the  given  plane  and 

r*  is  the  vector  from  the  origin  to  the  point  P*. 

Since  2     -     2  "^ 

r«7i  =  a,x  +  a„y  +  a_z  =  (a,  +  a_  +  a_)   c     (2.12g) 


and 


-3* 
r*.?l=  (a^  +  a22    +   a^)   (a±x*   +  a2y*  +  a3z*)   (2.12h) 

it  follows  that 

ii.222^  , 

|d|  ^  I  (a1  +  a2  +  a-)   (a,x*  +  a2y*  +  a3z*  -  c)  |      (2.12i) 

Extension  to  higher  dimensions  follows  the  same  arguments 
as  above  and  leads  to  the  general  result  of  (2.11). 

The  solution  to  the  equations  (2.1)  which  minimizes  the  cost 
function  (2.9)  may  be  obtained  by  considering  a  slight 
modification  of  (2.3).   Consider  the  vector  d  with  elements 
d.  given  by 

d  =  WAx  -  Wb  (2.13a) 
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I  d  I  -   I  NP*I 


FIG.  2.1    DIAGRAM  FOR   PROOF   OF£q.  (2.11) 
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where  W  is  a  diagonal  mxm  matrix  of  weighting  or  normalizing 

factors  w. .  given  by 

n       -\  _;* 

w±i   =  (  I      a±.)     ~  *    (aiai)    ,  i=l,2,...,m    (2.14) 
j  =  l 

Eq.  (2.13a)  may  be  rewritten  as 

d  =  A*x  -  b*  (2.13b) 

where    A*  =  WA  (2.15) 

and     b*  =  Wb  (2.16) 

The  solution  of  (2.13b)  in  the  least-square-error  sense  is 
the  desired  solution,  namely 

x*  =  [A*TA*]_1  A*T  b*  (2.17) 

It  should  be  emphasized  that  this  method  solves  a  different 
set  of  equations  (2.13a),  derived  from  the  original  set  of 
equations  (2.1)  by  normalizing  each  equation,  using  the 
standard  least-square-error-solution.   Thus  the  normalized 
least-square-error  solution  is  a  weighted  least-square- 
error  solution  of  (2.1).   This  is  quite  different  from 
other  possible  solutions  of  (2.1)  minimizing  alternate 
cost  functions  [18,21],  i.e. 

m    „ 

T     -      V        2P 

1 "  i=l  * 

where  p  is  an  integer  or 
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m 


J~  =   E   |e. 
z    i=l 


The  following  example  demonstrates  that  the  solutions  (2.3) 
and  (2.17)  may  differ  appreciably. 

Example  2.1:   Solve  the  set  of  4  equations  in  2  unknowns: 


Given    b  =  Ax   ,   b 


n 

1 

1 

o 

-1 

1 

l     ' 

A    = 

10 

10 

ij 

L-io 

10 

According  to  (2.3) 


x  = 


.0 
.09 


J 


According  to  (2.8) 


■ 


x*  = 


.0 
.05 


These  results  as  well  as  the  lines  defined  by  the  above 
equations  are  shown  in  Fig.  2.2. 

This  new  approach  to  the  solution  may  yield  more  meaning- 
ful results  where  the  indiscriminate  use  of  the  least-square- 
error  procedure  leads  to  unexpected  or  undesirable  results. 
Consider  for  example  the  problem  of  estimating  the  parameters 
of  the  semiconductor  diode  model  [14]  from  measured  electri- 
cal data.   If  the  diode  is  forward  biased  the  model  essen- 
tially reduces  to  a  resistor  R,  representing  the  combined 
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-}-  LEAST  SQUARE  ERROR  SOLUTION 

0  NORMALIZED  LEAST  SQUARE  ERROR  SOLUTION 


FIG.  2.2     EX.  2.1 
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body,  lead,  and  contact  resistance  of  the  actual  diode, 
in  series  with  an  ideal  diode  given  by 


'I-I 


VD  =  k  in 


k£n(I)  -  k£n(I  )  . 


V   is  the  voltage  across  the  ideal  diode,  I  the  diode 

current,  I   the  reverse  saturation  current  and  k  the 
s 

characteristic  constant  (for  constant  temperature)  of  the 
diode.   If  V  represents  the  voltage  across  the  actual 
diode,  the  model  is  described  by 

V  =  IR  +  k£n(I)  -  kiln  (I  )  . 

An  estimate  for  the  constant  parameters  R,  k,  and 
-k£n(I  )  is  obtained  from  £n  measurements  of  voltage  and 
current  for  the  forward  range  of  the  diode.   The  measure- 
ments may  be  written  in  the  form 


vl 

1 

zi 

Jin  (I,)            -k£n(I    ) 
1                             s 

v2 

1 

J2 

£n(I2) 

• 
• 

• 
c 

• 
• 

■ 
c 

R 

V 

m 

• 

i 

• 

I 

m 

« 

£n(I    ) 
m 

_    k    j 

The  least-square-error  solution  results  in  a  model  which 
is  very  poor  for  small  currents  and  excellent  for  very  large 
currents,  because  the  weights  attached  to  the  measurement  vectors 
[1  I.  Jin  (I  )]  may  differ  by  orders  of  magnitude.   Therefore 
the  normalized  least-square-error  solution,  which  weighs 
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each  measurement  vector  equally,  is  preferred,  resulting 
in  a  model  which  describes  the  actual  diode  adequately  for 
the  entire  forward  range. 

B.   RECURSIVE  RELATIONSHIPS 

1.   Development  of  the  Recursive  Relations  for  an 
Estimation  Problem 

In  estimation*  problems  the  least-square-error 
solution  is  widely  used  in  the  form  of  a  recursive  relation 
for  sequential  estimation  where  in  addition  to  a  set  of 
solved  equations  one  new  equation  is  considered  and  its 
data  taken  into  account**.   A  typical  example  of  such  an 
estimation  problem  is  to  determine  the  set  of  initial 
conditions  for  a  dynamic  system  from  a  sequence  of  discrete 
observations.   Consider  the  system  equation  given  in  dis- 
crete form  as  follows: 

y_k  =  *(T)  yk_1  (2.18) 

zk  -  M  y_k  (2.19) 

where  y,  and  y.  ,  denote  the  state  vectors  at  discrete 
times  kT  and  (k-l)T  respectively.   $ (T)  is  the  known 
transition  matrix  of  the  system  for  the  discrete  time  inter- 
val T.   M  is  the  known  observer  matrix  of  dimensions  1  x  n 

and  z,  is  the  scalar  observation  at  time  kT.   If  the  state 
k 


* 
The  term  estimation  is  used  to  designate  problems  in- 
volving the  solution  of  (2.1)  where  the  elements  of  A  are 
known  exactly. 

** 

See  Ref.  3,  page  51. 
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vector  at  k  =  0  is  denoted  by  the  vector  x  the  observations 
can  be  listed  in  the  following  form: 


M 


M$ 


k-7 


x 


(2.20) 


Estimation  of  the  vector  x  is  equivalent  to  solving  (2.1). 
Thus  in  general  form  (2.20)  may  be  written  as 


z,  =  A,  x 
— k    k— 


(2.21) 


If  A,  is  of  maximum  rank,  Lee  [3]  defines  (M,<J>)  as  an 
observable  pair  and  (2.21)  has  a  least-square-error  solution 
according  to  (2.3): 


-k  "  PkAk-k 


(2.22) 


where  z,  is  the  vector  of  k  observations 
— k 

T    t|  T  T '     k-1  Tl 
A,  the  k  x  n  coefficient  matrix,  A,  =  [M  i  $  M  ;  . . .  ,$    M 
k  k  ' 

T 
P,  the  inverse  of  the  matrix   A,  A,   and 

x,  the  least-square-error  solution  for  x  based  upon 
the  k  observations 


Now  consider  the  (k+l)st  equation 


T 
Zk+1  =  *  * 


(2.33) 


where  z,  , ,  is  the  new  observation  or  data  in  the  vector  z,  , , 

k+1  -k+1 

T  T               k+l 

and  a   a  row  of  new  coefficients.  a   is  equal  to  M$   for 
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the  estimation  example  above.   It  is  now  possible  to  define 

the  matrix  A.  ,  and  the  observation  vector  z,  , , . 
k+1  — k+1 


A 


k+1 


A 


*k+l 


ik 


Jk+1 


(2.24) 


The  least-square-error  solution  then  takes  the  form 


-k+1  :  Pk+1  Ak+1  -k+1 


where  P,  ,  ,  is  the  inverse  of  the  matrix  [A,  ,nA,  ,, 
k+1  k+1  k+1 


(2.25) 
] .   This 


result  can  be  written  in  the  following  recursive  form  [3] . 


X 


=  xn  + 


-k+1    -k 


Pk* 

T 

1+a  P.  a 
—  k— 


T   .  \ 
'k+1  "  *-  ^k} 


> 


^  Pk+1  =  Pk  " 


pk!iTpk 

T 
1+a  P.  a 

—  k— 


(2.26) 


It  should  be  noted  that  (1+a  P,  a)  is  a  scalar  and  is 

—  k— 

treated  accordingly.   The  derivation  of  (2.26)  normally 
available  in  the  literature  is  rather  involved  and  a  short 
derivation  which  has  been  developed  here  is  presented. 


-1    T 

P,   =  A,  A, 
k      k  k 


(2.27) 


-1  rp  iji  T~l  T 

pk+i  =  Ak+i3c+i  =  W  +  ^  =  pk    +  ^^ 


-1  T  "I 

P,  .  ,    =    [I    +    a  a   P.  ]    P. 


k+1 


--     k" 


(2.28a) 
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After  premultiplying  by  Pk  +  1  and  postmultiplying  by  P, 
(2.28a)  becomes 


Pi   =  Pn  ^i   +  Pt  _li   a  a   Pi 

k    k+1    k+1 k 


(a. 28b) 


Combining  (2.28b)  with  (2.28a)  yields 


P,  .,  =  P,  -  P,   [I  +  aaT  P.  J   aaT  P, 
k+1    k    k      —   k   k 


(2.29) 


T       -1 
The    expression    [I+aa    El       a  may   be    simplified. 

T  -1 

Let  £  =    [I    +    a  a     P    ]       a 

then           C(l+aTP,a)    =     [I    +    aaTP,  ]       a    (1    +    aTP,a) 
k  k  K"- 


=    [I    +    aaTP    ]  [a   +   aaTP  a] 


=    [I    +    aaTPR]  [I    +    aaT  Pk]a 


=    a 


Thus     £  =  a/(l  +  a  P.a)  and  (2.9)  reduces  to 


Then 


pk+l 

= 

T 
P,  aa  P, 
r     k —  k 

k    l+aTP,  a 
—  k— 

-k+1 

= 

T 

P        A        7 

k+1  *k+l  -k+1 

= 

T  -, 
P.  a  a  P. 
r     k k 

k    l+aTP1  a 

L       —  k— 

T 
Ak-k  +  zk+l- 

= 

I 

3  aT        Pk-     T^  _T 

k  k'k    l+aTP, 

-  "k"k-k  T 
a 

(2.30) 


,     pk£a  pka, 

:k+l  lPk-  "  77~Tl    i 
1+a  P,  a 
—  k— 
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=  2k  - 


pk*    t~         pki* 

■  a  *k  +  zk+i 


T 
1+a  P,  a 
—  k— 


T 

1+a  P,  a 
—  k— 


p  p\ 

k-   ,        T.  . 

-k+1  "  -k  +  nj_  t    (zk+l   -  -k} 
1+a  P,  a 
—  k— 


(2.31) 


(2.30)  and  (2.31)  are  the  desired  equations  of  the  recursive 
form  (2.26)  . 

By  comparing  (2.3)  and  (2.17)  the  recursive  relationship 
for  the  normalized  least-square-error  solution  follows 


x 


=  X*  + 


P*a* 
k— 


k+1   -k 


1+a*  P*a* 
—   k— 


fz* 

lzk+l 


p*    =  p*  _ 
^  k+1     k 


P,*a*a*  P* 
k—  —   k 

l+a*TP*a* 
—   k— 


(2.32a) 


Now  because  of  the  normalization 


T   -% 

z*+1  =  (a  a)    zk+1 


T 
i*    =  (a  a)    a 


-    / 


(2.32b) 


Substituting  (2.32b)  into  (2.32a)  yields 


V* 

-k+1 


p* 
v   k+1 


*k  + 


p*  _ 
k 


P*a 


a  a+a  P  *a 


a  a+a  P*a 


(a 


k+1 


ax*)  N 


(2.32c) 


From  (2.32c)  it  follows  that  explicit  normalization  of  each 
equation  is  unnecessary.   Instead  a  slight  modification  of 
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(2.26),  as  given  in  (2.32),  results  in  the  desired  algorithm, 
It  should  be  noted  that  (2.32)  is  not  valid  for  the  meaning- 
less observation  with  the  coefficient  vector  a  =  0,  which 
must  be  excluded  from  the  recursive  procedure.   For  this 
case 


x 


=  X* 


■k+1   -k 


)  for  a  =  0 


p* 
k+1 


P£ 


(2.32d) 


In  the  following  example  sequential  estimates  for  a 
vector  of  constant  elements  are  computed  from  noise  con- 
taminated measurements  using  both  the  least-square-error 
and  the  normalized  least-square-error  procedure.   The 
measurement  noise  is  derived  from  a  noise  population  with 
zero  mean  and  a  distribution  of  finite  extent  (i.e.  uniform 
distribution),  rather  than  from  a  normal  distribution,  in 
order  to  conform  with  practical  problems  where  the  largest 
possible  measurement  error  is  bounded. 

Example  2.2:   Estimate  the  unknown  vector  x  of  dimensions 
2x1  from  the  scalar  measurements  given  as 


zk  =  \±   +  \ 


2.33a) 


where 


k-1  2 
Mk  -  [1.  (Sjji)  ] 


(2.33b) 


M,  is  a  time  varying  observation  matrix,  z,  the  scalar  ob- 
servation and  v,  the  measurement  noise.   At  time  instant  k 
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the  k  equations,  according  to  (2.33a),  may  be  written  in 
matrix  form  as 


z,  =  A,  x  +  v, 
— k     k—   — k 


(2.33c) 


where 


A,  = 


0 
1/4 


,k-l,  2 

(    k  ' 


(2.33d) 


-k  :   [zlz2  * ' *  zk] 


(2.33e) 


\  =  [V1V2  •'•  Vk] 


(2.33f) 


Since  x  must  be  estimated  from  noisy  measurements,  z,  ,  it 
is  necessary  to  solve  the  equation 


z,  =  A.  x 
— k     k— 


(2.33g) 


The  solution  of  C*\33g)  for  the  estimate  of  x  is  given  by 


T 
-k  ~  PkAk-k 


where    P, 


T    -1 

(AkV 


(2.33h) 
(2.33i) 


The  estimation  error  is  then  defined  as 


x, 


x,  -  x1 

— k    — k 


T 
PkAk-k 


T 
-  P.  A,    Z, 

k  k   — k 


V 


pkAk^k 


(2.33k) 
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0.1 
0.9 


and  the  measure- 


Results  for  a  specific  case,  where  x  = 
ment  noise  v,  is  a  sample  from  a  uniform  noise  population 
with  maximum  deviation  ±0.1  and  zero  mean  are  shown  in 
Figure  2.3  and" Figure  2.4.   The  sum  e    of  the  absolute 
estimation  errors 


l~l,    |  2 
e  =   x,   +   x, 

1  k '    '  k 


(2.33£) 


-vl      ~2 

where    x,  and  x,  are  the  two  elements  of  x,  is  shown  for 

both  estimators  in  Figure  2.5.   Note  that  the  estimation 
error  does  not  approach  zero. 

The  experimental  result  of  Example  2.2  may  be  verified 
by  considering  a  similar  estimation  problem  where  the 
sequence  of  observation  matrices  is  given  as 


Ak  " 


(2.33m) 


The  matrix  P,  in  (2.26)  for  this  problem  takes  the  form 


Pk  = 


-1    1+ 


-1 

1 


for  k  >  2 


In  the  limit  as  k  goes  to  infinity 


£im 
k->-°°  k 


1    -1 

L-l      1 


(2.33n) 
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and 


1 

-1 

1 

1 

1 

£im   ~ 

k-*-00   — k 

• 

-1 

1 

0 

1 

1 

V-   (2.33o) 


Since   v.  is  a  random  variable,  it  follows  from  (2.33o)  that 


£im 
k+°° 


{Prob  (x-x)  =  0}  =  0 


and  the  estimation  error  will  approach  a  constant  bias 
dependent  on  the  value  of  the  first  noise  sample.   In 
Example  2.2  the  result  is  quite  similar  and  the  estima- 
tion error  will  approach  a  constant  bias  with  probability 
equal  to  1.   This  bias  is  dependent  mainly  upon  the  first 
few  noise  samples  as  shown  in  the  following. 


k       1  2 
2  (1  -  =-) 

2 

T 
A  A  = 


2  (!-=■) 
2     x 


z  (1-4^) 

2       x 


and   P,  = 
k 


1 

k           1    4 
2           1       " 

"~  k           12" 
1    E    (1-t-) 

k    2 

2 

1  k    1  4    1  k    1  2 

k|  d-i)2     i   i 


(2.33p) 
In  the  limit  as  k  goes  to  infinity  the  elements  of  P,  do  not 
go  to  zero  but  approach  constant  values.   An  approximation 
of  ,  1™  P,  may  be  obtained  by  converting  the  piecewise- 
constant  functions  in  (2.33p)  to  continuous  functions  and 
the  summations  to  integrations.   Thus 
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Since 


Also 


and 


k^-oo   k    o     v      1'         k-*°°   k    ^   i'  x 


*im    {hx   -    4£nx   -    i  +    2y--iT]k+'5} 
k"w      k  X        x2       3x3    1.5 


=    1-4   ,£im    [i  £n(k+.5)  ] 
k->°°      k 


=    1 


2 

5,1m   1    ,     ,,  i         Urn    r,  -1,  n    L         1  ,,    1.  -,-, 

,  r-   Jln(k)    =   ,  lk       [  (I-?-)    +   tt(1-H       +    ...J) 

k-*°°   k  k+°°  L         k  2         k 


=    0 
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k->°°      k  l  x    ,    _ 

1 .  _> 
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k+oo       Lk 


=  1 


*"*      {Z       (1_1)  _     1      [E       (i-l)2]       } 

k+°°      ~  i  k    l  i'     J 
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Urn 
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&im 


c  -)  -i  K  i    .  D      -■  -|      K"t"  .  j     Z 

k_    Ux-^nx-f  +  V-V    s   -|[(x-2£nx-i)|  ]    } 

x        3x      1 . 5  1.5 


-    -(1.5    -    4£n(1.5)    -   ^-tt  + 


5  2  3 

■L,D         1.5Z         3x1. 5J 


2  2 

+ki~    (  (k+.5)   -  4£n(k+.5)  -   (k+^5)      +  4    (k*'  5)  In  (k+ .  5)-  ^[ In  (k+.5)]  } 


3.33   +   ttm  r    [^nk] 


3.33 


Since 


*im  i    [iln(k)]2=    4    *im    [k"^n(k)]2 


-  4  *i>^[(i-|>  +  |  d-i)2...]} 


=    0 


Thus 


lira 
k->°°      k 


3.33 


1      -1 
-1      1 


(2.33q) 


and   also 


£im 
k->°°   — k 


£im   „    AT 
.  P,  A.  v, 

k->°°      k   k— k 


1 
-1 


Z      i    (2-i)v. 
.     ,    l  11 

1=1 


(2.33r) 
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{V1  +  IV2+I  V3+T6    V4+2T    V"} 
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Eq.  (2.33)r  is  a  weighted  sum  which  places  emphasis  on  the 
first  few  noise  samples  and  whose  weighting  factors  approach 
zero  as  k  becomes  larger  and  larger.   Thus  for  large  k 
the  estimation  error  depends  mainly  upon  the  first  terms  of 
the  summation  in  (2.33) r  and  approaches  a  constant  bias. 

As  a  consequence  of  Example  2.2  the  assertion  in  Lee 
[Ref.  3,  page  53]  that  .^^  P,  =  [0]  is  contradicted  and  the 
estimate  obtained  is  not  consistent. 

2 .   Application  to  Identification  Problems 

The  foregoing  results  may  also  be  used  in  identi- 
fication* problems.   Consider  a  discrete-time  system  char- 
acterized by  the  discrete  equation 

x,  =  a,x.  , +a~x,  ~+...+a  x,   +b,u,  , +b~u,  ~+...+b  u, 
k     1  k-1   2  k-2       n  k-n   1  k-1  A    k-2       ro  k-m 

(2.34) 

where  in  general  n  >  m,  and  x,    and  u,    represent  the 
r         —         k-n      k-n   r 

system  response  and  driving  function  at  time  t  =  (k-n)T. 
It  is  desired  to  determine  the  coefficients  a.  and  b. 

i     : 

(where  i=l,2,...n  and  j=l,2,...m)  from  sequential  measure- 
ments of  the  output.    If  the  measurements  z,  are  noiseless 
then 


Z-.  X,     —     I  X-.      -,  •  •  *    X-,  Xji-,  n     •  •  •    U*     J 

k    k     k-1      k-n    k-1      k-m 


*n 
fan 


after  k  =  n+m  measurements  the  following  data  bank  will 
result  if  the  system  starts  with  zero  initial  conditions 


The  expression  identification  is  used  to  designate 
problems  involving  the  solution  of  (2.1)  when  some  or  all 
of  the  elements  in  A  are  random  variables. 
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n+m 


xQ       0 


Xl       X0 


n+m-1    n+m-2    m 


u, 


u. 


u. 


u     ,   u    „ .  .  ,u 
n+m-1    n+m-2     n 


>1~ 

• 

• 

an 
bl 

• 

• 

• 

b 
m 

(2.35) 


or 


z  ,   =  [x  ,   :  u   ] 

—n+m      n+m  ;  n+m 


— n 

b~" 
— m 


(2.36) 


The  identification  problem  is  then  to  solve  (2.35)  for  the 

I  -n 
vector  -r —  .   An  exact  solution  is  obtained  only  if  the 

b  ■* 

matrix  [X    !  U    ]  is  nonsingular.   Sequential  estimation 

n+m  J   n+m  *  ^ 

is  then  possible  using  (2.26)  or  (2.32)  when  the  number  of 

observations  K  >  m+n  and  the  matrix  [X    i  U  ,  ]  has 

—  n+m  j  n+m 

maximum  rank  equal  to  m+n. 

In  the  presence  of  measurement  noise  the  observations 
of  the  output  become 


zk+l    xk+l  +  Vk+1 


where  vv.-,  is  the  measurement  noise.   Eq.  (2.35)  may  then 
be  approximated  by 


0     .  .  .   0 
zrt    ...   0 


zk-l   zk-2 


Jk-n 


?1~ 

l°k 

a 
n 

"5i 

b 
n_ 

(2.37a) 
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or 


5k  "  !zk  i  V 


a 
— n 

b 

-m 


(2.37b) 


This  solution  is  proposed  by  R.C.K.  Lee  [3] . 

An  alternate  approach  is  to  use  an  estimate  for  the 
element  x.  in  (2 . 35) ,  denoted  by  x.  in  the  matrix  X,  .   Thus 

X  X  Jv 


r 

X.  =  [x. 


X. 


x-1      x-n 


u.  ,  ...  u. 

x-1      x-nr 


a 
— n 


i — m  J 


(2.38) 


This  approach,  as  shown  in  Example  2.3,  has  a  distant  advan- 
tage over  the  solution  of  (2.37b).   This  example  also 
demonstrates  the  difference  between  the  least-square-error 
and  the  normalized  least-square-error  solutions  as  given  by 
(2.26)  and  (2.32) . 

Example  2.3:   Identify  the  coefficients  a,  and  b,  in  the 
difference  equation 


xk+i  =  aiVbiuk 


(2.39a) 


from  noise  corrupted  measurements  z,  ,  ,  where 


zk+l  ~  xk+l  +  vk+l 


(2.39b) 


and  v.  is  a  sample  of  the  measurement  noise  with  the  follow- 
ing statistical  properties 


E  {v. }  =  0 

x 


E  {v.  •  v  . }  =  6 .  ..a 
i     3      i/J 


6.  . 


<:  #} 


}    (2.40) 
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2 

where  E{   }  denotes  the  expected  value,  and  a   is  the  vari- 
ance of  the  measurement  noise.   Specifically  consider  the 
discrete  equation 

xk+l  =  *9xk  +  -1  uk  (2.41) 

For  a  given  input  sequence  {u,  ...  u,  }  and  zero  initial 
conditions,  the  output  sequence  {xn,x,  ...  x,  }  is  generated 
using  (2.41).   The  output  sequence  is  then  corrupted  with 
noise,  taken  from  a  uniform  distribution  with  maximum 
deviation  ±0.1  with  zero  mean  to  obtain  the  sequence  of 
noisy  measurements  {z  ,z,  ...  z,  }  which  then  are  processed 

O    _L  .K 

according  to  Eqs .  (2.26)  and  (2.32).   For  both  approaches, 
one  using  the  z. 's  and  the  other  using  the  x. 's  of  Eq.  (2.38) 
as  elements  in  the  coefficient  matrix  X,  two  computations 
are  made  -  one  where  the  input  is  a  unit  step  and  one  where 
the  input  is  a  sampled  cosine  wave  of  unit  amplitude. 

Typical  results  using  the  same  measurement  data  for 
both  estimators  (the  least-square-error  and  the  normalized 
least-square-error)  are  shown  in  Fig.  2.6  through  2.17. 
Figs.  2.6  and  2.7  present  the  estimates  for  a,  and  b,  for 
the  least-square-error  and  the  normalized  least-square- 
error  solutions  for  a  step  input.   Fig.  2.8  presents  the 
magnitude  of  the  total  identification  error   e.  =  |  a,  (k)-a.,| 
+  |b,  (k)-b,  |  for  both  cases.   Figs.  2.9,  2.10,  2.11  present 
the  identifications  using  the  estimated  past  values  for  X, 
as  given  by  (2.38).   Figs.  2.12  through  2.17  present  the 
corresponding  data  when  the  input  is  a  cosine  function.   The 
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results  indicate  that  the  identification  error  is  generally 
smaller  when  the  normalized  least-square-error  method  is 
used  and  that  the  identification  error  depends  upon 
whether  the  input  function  causes  a  significant  system 
response.   In  the  following  it  will  be  demonstrated  that 
for  the  step  input  the  identification  error  approaches  a 
very  large  value,  compared  with  the  parameters  to  be  iden- 
tified, independent  of  the  variance  of  the  measurement  noise 
while  for  the  cosine  input  the  identification  approaches 
a  constant  bias  dependent  on  the  variance  of  measurement 
noise.   Consider  the  least-square-error  solution  of  (2.37b) 
which  takes  the  form 


n 


=  { [z,  !  u,  ]T[z,  !  u,  ]}   [z,  I  u,  ]T  z. 

k  i  k    k    k      k    k    — k 


m 


(2.42) 


where  k  >  m+n.   For  Example  2.3  this  may  be  written  as 


r~  i 

-k-1 

k-1            -, 

-i 

-k-1                — 

ai 

Z      z. 
i=0      x 

Z      u.  z  . 
i=0     x  x 

Z  z  .  z  .  .  , 
i=0      x    1+1 

•s 

k-1 

E      u. 
i=0      X 

k-1 

bl 

Z      u  .  z  . 

i=0      x   x 

Z  u. z . ^n 
i=0     x  1+1 

—           — 

_ 

i 

(2.43a) 
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or 


-1 


I  V   2 
r-  L        Z  . 

k  i=0   x 


1 k"1 

t-  Z   u  .  z  . 
k  i=0 


1  k_1 

r-   Z   U.  Z  . 

k  i=0 


i  k:x    2 

s-   2    U. 

k  i=o  x 


r-  2   z  .  z  .  , , 
k  i=0   x  1+1 


1  k"1 

t-   Z   u.  z  .  ,  , 

1  i=0  X  1+1 


(2.43b) 


In  the  limit  as  k  goes  to  infinity  the  summations  in  (2.43b), 
provided  that  u,  and  x,  remain  bounded,  take  the  following 


form 


£im  1   „  2  Jlim  1  ,  _  2    „  „  v  2, 

.     r-   E  Z:.  =  ,     r-  [  Z  x.  +  2   £  x.v.  +   Z  v.] 

k-*00  k    .  i  k->°°  k  .  A  1      -n  !1    •  n  i 

1=0  1=0         1=0  1=0 


Urn  1   k„1   2     2 
,    ,—   Z   x.  +  a 
k—  k  i=Q    i 


£im   1      ,,                        £im   1  r    „  _                 t 

,          ,—      Z       u.z.=1          r-  Z  U.X.+       Z      u.v.j 

k->™  k    .    n      11        k->°°  k  .    A  11         .    A      11 

i=0  i=0  1=0 


£im   1      „ 

,  r-        Z         U.  X. 

k->°°  k    .    A      11 
i=0 


.  ^  t  k-1  0  .      ,    k-1  k-1  k-1  k-1 

icim  1     „  ximl,  „  „  „  n 

.        -r-     Z    z  ,  z.    ,  =  r-L^    x.  x.  .  _  +     Z    x. v  .  , , +    Z    x.  ,  ,  v.  +    Z    v  .  v  .  ,  ,  J 

k->«>  k  .    A    i  l+l     k->°°  k    .    A    i  l+l         A    l    l+l         _    l  +  l  i         A    l    l+l 
1=0  i=0  1=0  i=0  i=0 


Jlim  1      „ 

-*00   k    .     n      l    l  +  l 
1=0 


im  1      „  £im   1    r    „  v  , 

,         .—     Z      u .  z  .     ,    =   ,         r-    [    Z      u.x.+      Z      u .  v  .  ,  , 
k->°°  k    ±_         i    l+l         k+°°   k    li_Q      l    l         i_Q      i    l+l 


£im   1      „ 
,  =-      Z      u .  x . 

k->°°  k         A      l    i 
1=0 
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Then  for  very  large  k  the  following  relation  is  obtained 
from  (2.4  3b) 


9        9 

E   x7  +  kaz 
i-0   X 


k-1 

E   u.  x. 
i-0  x  x 


u-1 

E   u .  x. 

i=o  x  x 


k:x    2 

E   u. 

1=0  x 


k-i        n 

£   x. x. . , 
i^o  -1-  1+1 


(2.44) 


k-1 
=0 


j  E   u. x. , 
Li=r 


The  corresponding  expression  for  noiseless  observations  is 


E   x. 
i-0   x 


k-1 

E   x.  u. 
i-0   x  1 


k-1 

E   u.  x. 
i-0  x  x 


k-1 
E    2 

1  =  0    1 


k-1 

£   x. x. . , 
i-0   x  1+1 


k-1 

E   u. x. . , 
.i-0   X  1+1. 


(2.45) 


or 


:  k  1   2     2 

E   x  .  +  ko 

i-0   X 


k-1 
E   u.  x. 

Li=o  x  x 


k-1 

E   u.  x. 
i-0   x  X 


k-1 
E   x.x.+1    ka  a;L 


!  + 


k-1   . 

E   u 

i=0   ■ 


J    ^ 


k-1 

L-o UiXi+i- 


(2.46) 

The   identification  error  for  very  large  k  is  then  obtained 
by  combining  (2.44)  and  (2.46) 


k  1   2    2 

E   x . +ka 
i-0   x 


k-1 

E   u.  x. 
i-0   *  X 


k-1     1 

E   u.x. 
i-0   x  X 


k-1   . 

E  u' 

i=0   ■ 


1 


al"al 


i  2 
ko  a. 


brbi 
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or 


■l 

bl 

= 

al 

bl 

- 

al 
bl 

"k-1  2         k-1   "7 

E  x. +ka       E  u. x. 
i=0  X  i=0  X  X 


k-1 

E  u.  x. 
i-0  x  x 


k-1  . 

E  u 

i=0  ■ 


-fca  a. 


(2.47) 

In  the  limit  as  k  goes  to  infinity,  Eq.  (2.47)  is  easily 
evaluated  by  considering  the  corresponding  continuous  system 
with  the  transfer  function 


x(s)   ,   1 
u(s)    s  +  1 


(2.48) 


Integration  of  (2.48)  yields  (2.39a)  when  the  forcing  func- 
tion u(t)  is  approximated  as  piecewise  constant  and  the 
sampling  time  At  satisfies  the  relation 


-At 
e    =  a. 


or 


At   =   £n  a. 


(2.49) 


For  the  step  input,  when  the  system  is  initially  at  rest 


u(t)  =  1 


x(t)  =  1  -  e 


(2.50) 


-t 


Then 
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(2.51) 
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and 
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(2.52) 


also 


—      — 

—      — 

—      — 

i—         - 

£im 
k->°° 

al 

Lbx_ 

= 

Lbx_ 

£im 
k->°° 

k- 

= 

0 

(2.53) 


This  demonstrates  that  identification  of  system  parameters 
from  a  step  input  is  only  feasible  as  long  as  the  system 
response  is  not  close  to  the  final  value,  as  shown  in 
Fig.  2.6. 
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Similar   considerations   yield    the   estimation   error    for 
the    unit   cosine    forcing    function.      Using    (2.48)    and 


U(s) 


s2  +  l 


(2.54) 


it    follows 


x(s)    = 


h      ■.  \s+h 


(s>l)  (s+l) 


s+1 


+ 


s2+l 


(2.55) 


and 


u (t)    =    cos (t) 


x(t) 


(cos    t   +   sin    t   -   e      ) 


(2.56) 
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The  identification  error  as  shown  in  Eq .  (2.58)  approaches 
a  constant  bias  which  is  reasonably  small  for  a  large 
signal  to  noise  ratio.   For  the  specific  example  (Ex.  2.3) 
this  bias  as  obtained  from  (2.47)  for  very  large  k  is 

-.0212' 
L  .0106 

for  the  least-square-error  solution  and 


L^J 


-.016  3" 
.0054. 


for  the  normalized  least-square-error  solution.   The  total 
bias  error  reduction  for  the  normalized  least-square-error 
solution  is  approximately  32%  compared  with  the  least- 
square-error  solution.   This  agrees  with  the  experimental 
results  obtained  as  shown  in  Fig.  2.14. 
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LEAST   SQUARES    IDENTIFICATION 

-   COS.    INPUT   -  UNCORRECTED  COEFFICIENTS 
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III.   THE  BEST  APPROXIMATE  SOLUTION 

In  estimation  theory,  quite  frequently  one  has  to 
solve  a  set  of  inconsistent  or  insufficient  specified 
linear  equations.   Since  in  these  cases  an  exact  or  unique 
solution  does  not  exist,  an  optimal  or  best  approximate 
solution  has  to  be  accepted.   Penrose  [4,5]  has  defined 
this  best  approximate  solution  as  follows. 

Definition  1:   X   is  the  best  approximate  solution  of 
the  linear  equation 

f (X)  =  G,  (3.1) 

where   X   and  G   are    rectangular  matrices,    if    for   all   X  ^   X 


o 


either 


||  f  (X)    -    G||>||f  (X    )-G||  (3.2a) 

or  ||  f  (X)-G||  =  ||f  (XQ)-G||         and    ||  X  ||>_||Xo  ||       ,  (3.2b) 

T 
where  ||x||  denotes  the  norm  of  X  defined  as  ||  X  j|  =  trace  X  X. 

In  the  discussion  which  follows,  X  is  restricted  to  be 

a  vector  of  dimensions  n  x  1  with  real  elements  denoted  by 

x.   Then  Eq .  (3.1)  may  be  written  as 

Ax  -  b  (3.3) 

where  A  is  an  m  x  n  matrix  and  b  is  an  m  x  1  vector. 

The  best  approximate  solution  for  x,  according  to 
definition  1,  is  the  least-square-error  solution  if  A  has 
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maximum  rank,  and  the  minimum-norm  least-square-error 
solution  if  A  has  rank  less  than  maximum.   This  solution  is 
obtained  by  using  a  generalized  matrix  inverse  developed 
by  Penrose  [4],  and  denoted  in  the  work  which  follows  as 
the  pseudo  inverse.   The  solution  to  (3.3)  is  thus  written 
as 

x  =  A+b  (3.4) 

where  x  is  the  best  approximate  solution  and  A  the  pseudo 
inverse. 

In  the  following  the  definition  and  some  properties  of 
the  pseudo  inverse,  as  given  by  Penrose,  are  stated.   Then 
some  alternate  expressions  for  the  pseudo  inverse  are  dis- 
cussed.  Finally  a  new  recursive  formulation  for  the 
sequential  solution  of  Eq.  (3.3)  is  presented.   This  form- 
ulation has  the  advantage  over  previously  published  results 
(Wells  [6])  in  that  the  dimensions  of  the  matrices  in  the 
algorithm  remain  constant  irrespective  of  the  size  and  rank 
of  A.   A  flow  diagram  for  the  computation  of  the  algorithm 
is  presented  and  illustrated  with  a  numerical  example. 
Finally,  the  recursive  algorithm  is  adapted  to  the  problem 
of  estimating  the  states  of  time-varying,  linear  systems 
from  noise-contaminated  measurements. 

A.   THE  PSEUDO  INVERSE 

1 .   Definition  and  Properties 

Penrose  [4]  defines  the  following 
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Definition  2:  Four  matrix  equations  are  defined 

AYA   =  A  (3.5a) 

YAY   =  Y  (3.5b) 

[AY]  =  AY  (3.5c) 

[YA]*=  YA  (3.5d) 

where  *  denotes  the  conjugate  transpose.   These  equations 
have  a  unique  solution  for  Y.   This  solution  is  called  the 
pseudo  inverse  and  is  denoted  by  Y  =  A  . 

The  essential  feature  of  this  definition  is  that  any 
expression  for  the  inverse  of  matrix  A  is  established  as 
the  unique  pseudo  inverse  if  and  only  if  it  satisfies 
Eq.  (3.5).   As  a  consequence  of  definition  2  the  pseudo 
inverse  has  the  following  properties 

A++  =  A  (3.6a) 

A*+  =  A+*  (3.6b) 

A  =  A    if  A  is  nonsingular            (3.6c) 

(AA)+  =  A_1A+  (3.6d) 

(A*A)+  =  A+A*  (3.6e) 

+   +  + 

A,A*A,A  ,A  A   have  rank  equal  to  the  trace  of  A  A 

In  addition,  for  completeness,  define  [1] 

0+  =  0T  (3.6f) 

2 .   Alternate  Expressions  for  the  Pseudo  Inverse 
It  is  desirable  to  be  able  to  express  this  inverse 
by  a  mathematical  formula  and  the  following  results  are 
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essentially  available  in  the  literature  as  discussed  by 
Deutsch  [1] ,  Koenig  [6],  et.  al. 

a.   Overdetermined  case,  m>n,  r=n 

As  shown  in  Chapter  I,  the  solution  (3.4)  is 
obtained  using 


A+  =  [ATA]   AT 


(3.7) 


which  corresponds  to  the  minimum  mean-square-error  solution 
of  (3.3). 

b.   Underdetermined  case,  m <n ,  r=m 

The  solution  (2.4)  is  obtained  using 
-1 


A+  =  AT[AAT] 


(3.8) 


which  corresponds  to  the  minimum-norm  solution. 

Equation  (3.8)  satisfies  definition  2  and  is  thus  the 
desired  pseudo  inverse.   The  fact  that  the  solution  is  the 
minimum-norm  solution  can  be  demonstrated  geometrically  for 
the  three-dimensional  case  as  follows: 

Given  two  equations  in  three  unknowns 


~al 

a2 

a3~ 

X 

y 

— 

"cl" 

_bl 

b2 

b3- 

z 

_C2_ 

(3.9a) 


Find  the  minimum-norm  solution  for  the  unknown  vector 

T 
[x  y  z]  . 

Eqs .  (3.9a)  represent  two  planes 

r  •  a  =  c. 


r  •  b  =  c. 


(3.9b) 
(3.9c) 
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where    r  is  the  position  vector  from  the  origin  to  the 
point  (x,y,z) 

a  a  vector  with  components  a.,  ,  a~ ,    a~ 

and     b  a  vector  with  components  b, ,  b~,  b^ 

Normal  vectors  to  the  planes  are  given  by  a  and  b 

Then  any  point  on  the  line  of  intersection  of  the  two  planes 
satisfies  (3.9b)  and  (3.9c)  and  the  desired  solution  is  the 
point  on  the  line  of  intersection  closest  to  the  origin. 
Let  the  vector  from  the  origin  0  to  this  point  N  be  desig- 
nated by  ON  (see  Fig.  3.1).   ON  is  a  linear  combination  of 
the  vectors  a  and  b 


ON  =  y-^a  +  y  ?k 


al   bl 


ON   ,a2   b2 


a3   b3 


(3.9d) 


where  y.    and  Y?  are  scalars,  which  are  determined  from  the 
condition  that  ON  has  to  satisfy  (3.9b)  and  (3.9c)  as  the 
position  vector  r. 
Thus 


(Y]_a  +  Y2k)  *  1  =    ci 


(Y^  +  Y2k)  *  k  =  c2 


and 


Y- 


Y 


2^ 


a«a    a«b 
a«b    b*b 


-1 


Lc2 


(3.9e) 
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LINE  OF  INTERSECTION 


FIG.  3.1    MINIMUM  NORM  SOLUTION     ON 
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Equation  (3.9d)  may  be  written  in  component  form  using 
(3.9e)  as 


XN 

^N 

= 

_ZN_ 

al  bl 

a2   b2 
a3   b3 


Z  a 
i=l  ' 


L    a.b. 
i=l  >~    x 


Z  a.b.    E  b. 
•  ill   .  ,  i 


-1 


_C2j     (3.9« 


or 


V 

— 

ai 

bl 

r 

"      ^ 

a2 

a3 

% 

= 

a2 

b2 

it 

^1 

b2 

b3_ 

_ZN_ 

_a3 

b3_ 

al  bl 

a2   b2 


^a3 


If  the  matrix  A  is  defined  as  in  (3.9a) 


A 


b. 


-1 


.1 


(3.10a) 


(3.10b) 


the  result  (3.10a)  may  be  written  in  the  form  of  (3.4) 


XN 
ZN 


=  A 


Lc2. 


(3.10b) 


+     T    T  -1 
where  A   =  A  [AA  ]   . 

This  shows  that  in  this  case  the  pseudo  inverse  in  (3.4) 

results  in  the  minimum-norm  solution. 

c.   Underdetermined  case,  m>_n ,  r<n  or  m<_ n ,  r<m 

The  solution  (3.4)  is  obtained  using  either 

-1 


,  m         rp     m       -   rp 

A   =  AN  [N  AA  N]   N 


(3.14a) 
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or       A+  =  MT[MATAMT]   MAT  (3.14b) 


which  corresponds  to  the  minimum  norm  solution  of  minimum 
square  error.  Matrices  N  and  M  are  defined  as  factors  of 
A  [1] 

A  =  N-M  (3.11) 

where  matrix  N  is  of  dimension  m  x  r,  and  matrix  M  is  of 
dimension  r  x  n.   The  rows  of  N  are  chosen  such  that  they 
constitute  a  set  of  base  vectors  for  the  column  space 
spanned  by  A.   Matrix  M  is  then  the  transformation  of  N  to 
A.   Its  dimensions  are  necessarily  r  x  n.   For  example,  the 
columns  of  N  might  be  chosen  as  all  the  independent  columns 
in  A.   The  pseudo  inverse  is  then  given  as 

+      rp     T-l     T-1T 

A   =  M  [MM  ]     [N  N]   N  (3.12) 

because  (3.12)  satisfies  all  four  equations  in  definition  1 
as  indicated  below: 

-l-  T>  T-l  T-1T 

(1)  A   A   =    M    [MM    ]        [N   N]       N   NM 

T  T-l 

=    M     [MM    ]       M 

=     [A+A]T 

-(-  fp  rp-lT-lT 

(2)  AA      =    NM  M    [MM    ]        [N  N]       N 

T       -IT 
=    N[N    N]       N 

=     [AA+]T 

(3)  AA+A    =    NM   MT[MMT]~1[NTN]~1NTNM 

=    NM 
=    A 
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+        +  T  T    -1         T        -1      T  T  T-1T  T 

(4)       A    AA      =    M     [MM    ]         [N    N]       N    NMM     [MM    ]         [N    N]       N 

T  T    -1       T       -1    T 

=    M     [MM    ]         [N    N]        N 

=    A+ 


Expression  (3.12)  which  involves  two  matrix  inversions  may 
be  simplified  further  to  expressions  involving  only  one 
matrix  inversion.   Since  both  matrices  M  and  N  have  rank  r, 
(3.11)  can  be  solved  for  either  one 

M  =  [NTN]-1NTA  (3.13a) 

N  =  AMT[MMT]"1  (3.13b) 
Substitution  of  (3.13a)  into  (3.12)  yields 

A+  =  ATN[NTAATN]-1NT  (3.14a) 

and  substitution  of  (3.13b)  into  (3.12)  results  in 

A+  =  MT[MATAMT]"1MAT  (3.14b) 

All  expressions,  (3.12),  (3.14a),  and  (3.14b),  are  valid 
general  expressions  for  the  pseudo  inverse.   However,  if 
the  matrices  involved  in  actual  computation  are  to  be  as 
small  in  dimensions  as  possible,  (3.14a)  and  (3.14b)  should 
be  used  as  follows: 

( 1 )  m<n ,    r>jn 

■  rp  mm  _  1       rp 

A      =    A   N[N    AA   N]       N 

( 2 )  m>n ,    r<_ n 

i  m  m  m      _  l  m 

A      =    M     [MA   AM    ]       MA 
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B.   RECURSIVE  ALGORITHM  FOR  THE  SEQUENTIAL  LEAST-SQUARE  FIT 

In  Chapter  I  a  recursive  relationship  for  sequential 
estimation  based  on  the  equation 

b  =  Ax  (3.15) 


and  its  least-square-error  solution 

x  =  [A1  A]  ■LAib  (3.16) 


is  given.   However  this  recursive  form  is  only  valid  if  the 

T  T   -1 

matrix  [A  A]  is  nonsingular  such  that  its  inverse  P  =  [A  A] 

exists . 

Consider  now  the  case  where  no  assurance  as  to  the 

existence  of  the  inverse  can  be  given.   Using  the  pseudo 

inverse  it  is  possible  to  write  formally 

x  =  A  b 

+    +  T 
=  A  [AA  ]    h 

■  mi  m  (3.17a) 

=  A  A1  AXb 

=  [ATA]+ATb 

T 
or      x  =  P  A  b 

T 
where  P  is  the  pseudo  inverse  of  [A  A] .   The  dimensions  of 

the  matrix  P  are  always  n  x  n  independent  of  m  or  r.   This 

is  significant  because  in  a  sequential  procedure  both  m  and 

r  increase  as  more  data  are  incorporated.   Therefore,  the 

use  of  an  algorithm  updating  the  matrix  A  ,  as  proposed  by 

T.N.E.  Greyville  [22],  may  not  be  practical  for  sequential 

estimation  because  the  dimension,  m,  of  A  grows  at  each 
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step.   Alternate  methods  [7,8]  using  an  updating  procedure 
for  A   until  the  matrix  A  has  dimension  n  x  n  are  considered 
below: 

(1)  Direct  updating  of  A  . 

At  each  step,  as  additional  measurements  are  incorpora- 
ted, the  size  of  A  grows  one  column  for  each  step.   C.  H. 
Wells  [7,8]  presents  an  algorithm  for  the  updating  of  A  . 
However  his  procedure  has  the  disadvantage  that  when  initi- 
ating an  estimation  problem  the  rank  of  A  must  increase  at 
each  step  until  maximum  rank  is  reached.   This  is  not  the 
case  if  the  first  n  equations  in  (3.15)  are  dependent. 

(2)  Updating  A   using  (3.8)  as  long  as  m  <n . 

A  recursive  algorithm  or  direct  computation  based  on 

T 
(3.2)  is  possible  only  if  the  square  matrix  AA  ,  of  growing 

dimensions,  remains  non-singular.  Thus  the  rank  of  A  has 
to  increase  at  each  step  which  may  not  be  true.  Further- 
more, a  recursive  form  could  be  used  only  as  a  starting 

T 
procedure  up  until  the  matrix  AA  has  dimensions  n  x  n. 

Consequently,  it  is  desirable  to  find  a  recursive 

formulation  similar  to  (2.26),  where  all  matrices  involved 

have  constant  dimensions  regardless  of  rank  or  size  of  A. 

This  result  is  accomplished  here  with  a  recursive  form  for 

T 

the  matrix  P,  where  P  is  the  pseudo  inverse  of  A  A. 

In  order  to  derive  this  recursive  algorithm,  consider 
the  set  of  equations 


^k 


(3.18a) 
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where  the  subscript  k  denotes  the  number  of  equations. 
Assume  the  solution  of  the  form  (3.17a) 


~       T    +  T 
x,  =  [A,  A,  ]  A,  zn 
— k      k  k    k  — k 


T 
P,  A,  z, 
k  k  — k 


(3.19a) 


where  P,  is  the  pseudo  inverse  of  A,  A,  and  xn  the  best 

k        r  k  k     — k 

approximate  estimate  for  x  based  upon  the  last  k  equations. 
Then  at  the  next  step  (3.18a)  takes  the  form 


*k 


L'k+l-J 


Ak+1-  " 


_k_ 

T 
a  J 


x 


(3.18b) 


T  . 
where  z,  , ,  is  a  new  scalar  measurement  and  a   is  a  row 
k+1  — 

vector  of  coefficients  relating  the  observation  z,  ,  to  x. 


The  solution  to  (3.18b)  is  then 


T 
-k+1  :  Pk+1  Ak+1  -k+1  • 


(3.19b) 


In  order  to  find  an  alternate  expression  for  (3.19b)  let 


x  =  x,  +  Ax 


(3.20) 


Substitute  (3.20)  into  (3.18b)  and  premultiply  with  A,  , 
to  obtain 


k+1 


:k+l 


=  Ak+1  Ak+1 (^k  +  A^} • 


or 


a,  z,    z,  i  a  — 
k— k    k+1— 


[ARAk  +  a  a  ]  (xk  +  Ax) 
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But 

T       + 

-tit  Ai   —   it-,    • 

k  k     k 

Thus 

A,Tz.  -  P.+x,  +  (z,  .,  -  aTx.)a  =  [P,+  +  aaT]Ax    (3.21a) 
k— k    k— k     k+1    —  — 1  —     k — 

T        +  * 

The  term  A,  z_,  -  P,  x,  can  be  shown  to  equal  the  null  vector 
as  follows.   According  to  the  defining  equations  of  the 
pseudo  inverse,  (3.5), 

T     T  T+  T 
k    k  k   k 

i        m   m 

"  [AkV  Ak 

+    T 
=  AkAkAk 

+      +      T 
=  A,  [A.  A,  ]A,  A, 
k   k  k   k  k 

=  A+A+TATA  AT 
AkAk  AkAkAk 

rp       ■      m         m  i    rn 

=  [A,A,  ]   [ATA,  ]Ar  =  PvP^Ar  .  (3.22) 

k  k     k  k   k     k  k  k 

Also  since  P.  =  P^  and  P,+  =  [P.+  ]T 
k     k       k      k 

P  P+  -  [P  P+ 1 T 
k  k    L  k  kJ 

=  P+TPT 
k   k 

■  Vk  (3-23) 

Then  using  (3,22)  and  (3.23), 

Ak-k  "  Pk-k  =  Ak-k  "  PkPkAk-k 
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=     [I    -    P*P,  ]A?z. 
k   k      k— k 


=  [I  -  pkpk]AA 


■  [I  "  PkPk]PkPkAk5-k  "  2.  •         (3-24) 


Then  Eq.  (3.21a)  reduces  to 

(zk+l  "  -T^k)  -  =  [Pk  +  --T]  A-*  (3.21b) 

The  solution  of  (3.21b)  for  Ax  is  obtained  as  follows. 

(1)   If  P.  has  rank  r  <  n   and   [I  -  pk?k]  a  ¥■   0 
the  solution 

(i)      [I_pkpk]  a  T 

AxU;  =     K  ? (z     -  aXx  )  (3.25) 

aT[I-P^Pk]a   k+1   "  _k 

satisfies  (2.21b)  by  inspection.   Ax    is  not  defined  if 
either  P,  is  of  rank  n,  which  implies  that  [I -P.P.]  =  0  ,  or 


k  k 


if  tI"P^Pk3  a  =  0. 


(2)   If  Pk  has  rank  r  =  n ,  or  if  [I-p£p.]a  =  0 

the  solution  is  given  by 

P  a 
Ax(2)  =  jpF—  (zk+1  -  aTxk)  (3.26) 

1+a  P.  a 

—  k— 

Substitution   of    (3.26)    into    (3.21b)    yields 

T- 

[P^+aa1]    Axu;    =   -^^ £   {a-   (a/p    a)a-   [I-P^ja}     (3.27) 

1+a   P.  a 

—     k— 
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Since  either  [I-P,  P,  ]  =  0  when  P,  has  rank  r  =  n,  or 

k  k  k 

[I-P,P,  ]a  =  0  (3.27)  reduces  to  (3.21b).   Using  the  above 
two  possible  solutions  for  Ax,  which  according  to  the 
conditions  (1)  and  (2)  are  mutually  exclusive,  recursive 
forms  for  the  solution  of  (3.18a)  are  established. 


(1)  Recursive  relation  if  P,  has  rank  r  =n ,  or  if 
[I-P+Pk]a  =  0 
From  (3.20)  and  (3.26)  , 

*k+l  =  *k  +  .   T~    (zk+l  "  ^k}  (3'28a) 

1+a  P.  a 
—  k— 


Combining    (3.28a)    and    (3.19b)    determines    the    updating   pro- 
cedure   for    the   matrix   P,   , n : 

k+1 

T 

T                Pk--  T  Pk- 

X,   ,,     =    P,  A,  2,     -    — ^ P,  A,  z,     +    z. 


-k+1  k   k-k         ..    T„  k   k-k  k+1    ,,    Tn 

1+a   P,  a  1+a   P,  a 

—     k—  —     k— 

T 

Pi, m ]      A,    Z,        +      Z 


k         .  ,    T_  k-k  k+1    ,  ,    TV 

1+a   P,  a  1+a   Pa 

—       K —  K 


Since 


P,aa  P,  P,  a 

[Pv    -      k~~     k]    a   = 


Also 


V  T1  —  T 

1+a  P.  a  1+a  P.  a 

—     k—  —     k— 


T_ 

*k+l    =    [Pk    "    ,*>p   k]     tA%   +    Zk+1^]     ' 
i.  +  a   r,  a 
—     k— 


*k+l   =   Pk+1    [Ak^k    +    Zk+1^] 
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Thus  , 


pk+i  ■  pk 


T 
P,  a  a  P. 
k k 

T 
1+a  P,  a 

—  k— 


(3.28b) 


The  new  matrix  satisfies  the  defining  equations  for  the 
pseudo  inverse  (3.5).   Thus  (3.28a)  and(3.28b)  constitute 
the  required  recursive  form 


x 


■k+1 


k+1 


*k  + 


Pk  " 


Pk* 


T 
1+a  P.  a 

—  k— 

T 
P,  a  a  P. 
k k 

T 
1+a  P.  a 

—  k— 


(z 


k+1 


T-  , 
a  2i ) 


(3.28) 


(2)  Recursive  relation  if  P,  has  rank  r<n  and 
[I-P,  P,  ]  a  7^  0_.   This  condition  excludes  the  solution  (3.26), 
which  does  not  satisfy  (3.21b).   Then  from  (3.20)  and  (3.25) 


:     [I-P>k]a         T^ 

-k+1   ■    *k    +£TfI_p+Pk]£    <zk+l    "    *%) 


(3.29a) 


For  notational  convenience  define 


^k+1 


[I-PkPk]a- 
aT[I-P;Pk]a 


(3.30) 


and  note  that  g_,  ,  has  the  following  properties 


T         T 

*  gk+i  -  z.k+1*  =  1 


Pk£Lk+1   =   Pk£k+1   -   0 
T  T         +  T 

ak+ipk  ■  ak+ipk  =  a 


(3.31) 
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The  desired  updating  procedure  for  P,  .  is  found  by  com- 
bining (3.20)  and  (3.25) 


*k+1  =  *k  +  gk+1  (zk+1  -  a  xk) 


T  T         T 

=    pk\£k    -    gk  +  ]_a   PkAkzk    +    zk+1gk+1 

=  tpk  -  ak+1£Tpk]A^zk  +  zk+1gk+1 


Also, 

xi  .  i  =  [Pi  .i]A  z.  +  z,  ,.P,  .a 
— k+1     k+1   — k    k+1  k+1— 

Then  Pfc+1  must  satisfy  the  following  conditions 

(a)  pk+i  ■  pk+i  (3-32a) 

T 
since  A,  ,-A,  , ,  is  a  symetric  matrix. 
k+1  k+1        J 

(b)  Pk+lAk  =  [Pk  "  2k+l^Pk]Ak  (3'32b) 

(c)  Pk+1  a  =  gk+1  (3.32c) 


A  possible  solution  satisfying  the  above  conditions  is 

pk+l  =  Pk-ak+i£Tpk  "  pk£^+i+  (1+£Tpk£)£k+i£k+l 

(3.29b) 
Assuming  symmetry  of  P,  ,  Eq.  (3.29b)  satisfies  (3.32a) 
by  inspection,   Using  (3.22)  and  (3.31),  Eq.  (3.29b)  can 
be  shown  to  satisfy  (3.32b): 

pk+iAk  ■  [pk-ak+iaTpkiAk  "  p^iiAk  +  <-l+*\mk+1sLl+A 
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Since 


and 


T         T  T  +      T 

ak+lAk  =  2k+1PkVAk 


T  T 

Sku-iPi,   =  P.   > 


i-k+1  k 


then 


m  mm 

pi._l-.ai     =    [I-g,  xla   PJA. 
k+1   k  -S-k+l—     k      k 


The    last   condition    (3.32c)    is    also   satisfied  by    (3.29b)    since 
pk+1£  7  pkl-2k+1£TPka-Pka^+1a  +(l+aTPka)2k+l2^+1a 
■   Pk»  "    <iTPk£>SLk+i   "   Pka  +    d+aTPka)£k+1 
Hence 

pk+i*  =  gk+i 

In  order  to  prove  that  (3.29b)  is  indeed  the  correct 

T       + 
and  unique  expression  for  the  pseudo  inverse  P,  ,  =  [A,  , A,  ,  ]  , 

P,  ,  has  to  satisfy  the  defining  equations  for  the  pseudo 

inverse,  (3.5).   Proof  that  the  equations  in  definition  2 

are  satisfied  follows: 

Proof   Using  (3.31)  and  (3.23), 
(1) 

pk+ipk+i=[pk-2k+i^Tpk-  pk£%+i+  d+£TPk£)ak+12k+i]  K+**T] 

=   PkPk   "   ak+1£TPkPk  +   pk^^-2k+i^Tpk^)£T 

"   pk££Lk+lii5-T+    d+£Tpk£)£k+1ak+l^^T 
■   *A    -   *k+l*\K   +   ak+1£T  (3.32a) 
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+  +        tI-pkptl£ST[i-Pkp!] 

Pk+lPk+l    ■    PkPk    +    T.T    Z    pTl  (3'32b> 

a  [I-PkPkla 


Thus  P^+1Pk+i  is  symmetric  and 


+  +    T 

P    P     =  TP    Pi 
k+1  k+1    L  k+1  k+lJ 


™  Pk+lPk+l  =  [Pk+lPk+l]T 


This  follows  since  P,  ,  ,  P,  ,  ,  P^.-jP^.-i  are  symmetric. 


(3)   Using  (3.32b) 


P+      P         P+  P+       TP         P+       1 

k+1   k+1  k+1        k+11   k+1  k+1 J 

[P1+aaT]Jp1  Pn+  + 


aT[I-Pk<]    a 


+  +  T  +  T  + 

=   P.  P.  P.     +    a  a  P.  P.     +    a  a   [I-P.P.  ] 
k   k   k k   k k   k 


+  T 

=   P.     +    aa 
k        


+  +  + 

P         P         p  =    p 

k+1   k+1   k+1  k+1    * 

(4)       Using     (3.32a) 

p      p      p        =   rp      p       ip 

k+1  k+1  k+1         L   k+1   k+1 J    k+1 

=  <\Pt+  9k+ii*ti-P!^Jlj£Pk-^rtl«TPk-Pltasf+1 

+    (l+aTPka)gk+1g£+1] 

=PkPkPk"  gk+l^k+l^Tpk-  PkPkPk^^k+l  +  d^k^^k+l^k+A+l 
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pk  -gk+1£Tpk  -pki£li  +  (i+a\i^+ia? 


k-'^-k+l^k+l 


P   P   P 
k+1  k+1  k+1 


=  P 


k+1  * 


This  concludes  the  proof  and  (3.29b)  is  indeed  the  desired 
updating  procedure  for  the  matrix  P,  . 

To  complete  the  recursive  algorithm  for  the  solution  of 
(3.18)  a  recursive  form  for  the  matrix  R,  =  [I-P  P  ]  has  to 
be  established.   From  (3.32a) 


Pi-l-iP^Li  =  P,P,+  +  glxlaT  [i-p.p*] 
k+1  K+1     k  k    2-k+l—      k  k 


then 


[I-pk+ipk+i]  -  [I-pkpkJ 


ak+1aT  n-Vk1 


and 


Rk+1   "   Rk    "   2k+i    *   Rk 


(3.33) 


Note    that   the   matrix  R.  remains    unchanged   if   the    recursive 
form    (3.28)    is    applicable   because,    in   this    case, 


P        P 
k+1   k+1 


T      -i 
P.  a  a  P. 
k —     k 

T 
1+a  P.  a 
—     k— J 


+  T 

P,     +    a  a 
k        — 


T 

P,  P,     +   P,  a  a     -   = P, 

k  k  k —         lx   T„  k 

1+a  P,  a 
—     k— 


T 

g   Pk»-      p  T 
T PVaa 

1+aVa      K 
—     k— 


P         P 
k+1   k+1 


+  Pka-^  + 

p  p    +  — n-p  pi 

*krk        1-L   Tn         LX   *W    • 
1+a   P.  a 
—     k— 


Since   either 


[I-PkPk]    =    0 


or 


T  +  T 

aMl-P^]    =    01, 
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then 


P    P+    =  P  P+ 
k+1  k+1    k  k 


and 


Rk+1  ~  Rk  ' 


Then  the  complete  recursive  algorithm,  if  P,  has  rank  <  n 
and  [I-Pkp£]a  ^  0,  is  given  by  (3.29a),  (3.30),  (3.29b), 
and  (3.33),  which  are  summarized  as 


£k+l 


R,  a 
k— 

a  R,  a 
—  k— 


x,  .  ,  =  x,  +  g,  ,  ,  (z,  ,  ,  -ax,) 
— k+1    — k    —k+1   k+1    —  — k 


(a) 


(b) 


pk+i  ■  pk-SLk+i^Tpk-pk^+i+<1+iTpk^ak+1gLi  (c) 


Rk 


+  1 


pk  -  ak+i£Tpk 


(d) 


(3.34) 


J 


where  R,  =  [I-P,  P,  ]  is  an  idempotent  matrix,  the  trace  of 
which  is  equal  to  n-r,  where  r  is  the  rank  of  A,  and  n  the 
maximum  possible  rank  of  A,  .   A  computation  flow  diagram  for 
the  recursive  algorithm  (3.28)  and  (3.34)  is  given  in 
Fig.  3.2.   Note  that  if  the  normalized  least-square-error 
solution  is  desired,  only  Eq.  (3.24c)  has  to  be  changed  to 

pk+i  ■  pk "  ak+1a\  -  *k»Skn  +  ^±+*\^3.k+A+i      <3-34e> 

The  complete  algorithm,  Eqs .  (3.28)  and  (3.34),  is  illus- 
trated in  the  following  simple  example. 
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LEAST   SQUARES  FIT    RECURSIVE  ALGORITHM 
COMPUTATION    DIAGRAM 


YES 


YES 


CRIT=gTRkg/qTg 


NO 


USE 

EQUATIONS 
3.34 


RETURN 


NO 


YES 


Jkl^ 


NO 


Xn=(Jk/gTa)g 
Pk  =  oaT/(a_Ta)2 
Rk=  I  -ggT/gTg 

RETURN 


USE 
EQUATIONS 
3.28 


RETURN 


FIG.  3.2 
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Example  3 . 1   Given  the  four  equations  below,  calculate 
the  best  approximate  estimate  for  x  sequentially. 


[1  0]x 

[3  0]x 

[1  l]x 

[2  l]x 


Following  the  computation  diagram  in  Fig.  3.2  the  results 
below  are  obtained 


a) 


*1  = 


k  = 

"l 

0 


Zl  -  X 


P,  = 


1   0 


0   0 


aT  =  [1   0] 


Rl  = 


0   0 
0   1 


trace  R,  =  1 


b) 


k  =  2     z2  =  2 


a1    =  [3   0] 


trace  Rx  =  1 ,   CRIT  =  0,  then  use  (3.28) 


1.3 
0 


*2 

= 

+ 

Pl^ 

T 
1+a  Pa 

P2 

= 

Pl 

- 

T 
P  a  a  P 

T 
1+a  P1a 

To  0 

R2 

Rl 

0   1_ 

race 

R2  . 

1. 

.1   0 
0   0 


79 


c) 


k  =  3     z3  =  3     a  =  [1   1] 


trace  R,  =  1 ,  CRIT  =  .5,   then  use  (3.34) 


R2- 

2-3  =  -Tl 

a  R2a2 


*3  =  *2   +   23  (z3-a  ^2}  = 


1.3 
1.7 


T  T        T        T 

P3  =  P„  -  g_a  P„  -  P2a_cj_3  +  (1+a  P2a )  £32.0 


.1   -.1 
-.1   1.1 


R3  ~  R2  "  £3^  R2  " 


0   0 
0   0 


trace  of  R,  =  0,  thus  all  following  equations  are  processed 
according  to  Eq.  (3.28). 


d) 


k  =  4 


z4  =  4 


a      =  [2   1] 


*4 


P.  = 


£3 + 


P3  " 


Pk^ 


T 
1+a  P.  a 
—  k— 

P3^Tp3 

T 
1+a  P3a 


(z4-a  x3) 


.0952 
-.104 


1.29 
1.67 


-.104 
.714 


The  vector  x.  is  then  the  best  approximate  solution  to  the 
given  set  of  equations. 

The  recursive  algorithm  presented  here  is  more  general  than 
the  one  used  in  Chapter  II,  because  it  includes  a  starting 
procedure.   Regardless  of  the  rank  of  A  the  estimate 
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obtained  is  the  best  approximate  solution  according  to 
definition  1. 

C.   ESTIMATING  THE  STATES  OF  A  LINEAR  DYNAMIC  SYSTEM 

An  alternate  interpretation  for  the  above  recursive 
algorithm  is  that  of  determining  the  constant  state  vector 
x  for  the  following  system  from  a  series  of  noise-contam- 
inated scalar  measurements: 


wh 


System:        ^+1  =  I  xk 


Measurement:   zk+1  =  Mk+1xk+]_  +  vk+]_ 


ere  M,  ,  is  the  time-varying  observation  matrix 


and  Vi  ,  ■>    is  the  measurement  noise. 
k+1 

In  order  to  be  able  to  estimate  the  state  vector  for 

a  dynamic  system,  where  the  transition  matrix  $k+-i  k  is 

in  general  time-varying  and  not  equal  to  the  identity 

matrix,  it  is  desirable  to  develop  an  algorithm  similar  to 

(3.28)  and  (3.34)  for  the  following  systems  and  the  scalar 

measurements  z 


System:        x_k+1  =  \+1  fk^  (3- 35a) 

Measurement:   zl-+i  =  Mk  +  1— k+1  +  Vk+1  (3.35b) 

For  the  case  when  the  system  of  equations  for  the  estimation 
of  x,  is  determined  or  overdetermined  (P,  has  rank  n)  Eqs . 
(3.28)  are  easily  adapted  to  include  the  transition  matrix 
[3].   Let  the  equation 
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*k  =  AA  +  vk  (3-36a> 

be  valid  at  time  instant  k,  then  the  solution  is 

xk  =  ^V'S^k  (3.36b) 

and 

Pk  =  [AkAk]_1  (3.36c) 

Using  the  property  of  the  transition  matrix  that 

[*k,k-irl  -  Vi,k  -  (3-37a> 

Eqs .  (3.36)  at  time  instant  k+1  take  the  following  form 


*k  =  Ak$k,k_l*k+1  (3*37b) 

r,T     ^T,,  .       i-l*T     aT 

*k+i  =  [*k,k+iWk,k+i]    *k,k+i AA 


=  ^t  _i_t  i  xt  (3.  37c) 

k+l,k— k 


T     T         -1 
k+1      k,k+l  k  k  k,k+lJ 


58    $v_li    uPjLi    i  (3.37d) 

k+l,k   k   k+l/k 


Thus   whenever   the   matrix  P,     has    rank   n,    the   valid    recursive 

k 

algorithm  for  the  dynamic  case,  including  the  new  observa- 
tion Zj.+1/  is 
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-k+1 


k+l 


-k+1 


k+l,k  k  k+l,k 
Q 


T 

M    M    O 
k+1  k+1  k+luk+l 


k+1  T 

k+1  k+1  k+1 


(3.38) 


'k+l,k-k  + 


Qk+lMk+l 


1+Mk+lQk+lMk+l  j 


Consider  now  the  case  when  [A,  A,  J  is  singular  so  that  the 
pseudo  inverse 


Pk  =  tAkAk]+ 


(3.39) 


should  be  used.   At  time  instant  k+1 


T       + 
0    =  r$     p   $     1 
yk+l    L  k,k+l  k   k,k+lJ 


(3.40) 


A  comparison  of  (3.39)  and  (3.40)  can  easily  be  made  using 
(3.14a) .   Let 


then 


A,  =  N-M 
k 


+     T    T    T  - 1   T 
A,  =  M  [MA,  A.  M  ]   MA. 
k         k  k        k 


Also  define 


Bk  =  Ak$k,k+1 


then 


+         TT  'ttt  T 

k,k+l       l      k,k  +  l    k    k      k,k+l  k,k+l      k 


Thus  (3.39)  and  (3.40)  may  be  written  as 
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T    +  T  "l 
P   =  M  [MPkM  ]   M 

T      T  +   T        T 

k+1     k,k+l    l    k,k+l   k   k,k+l  k,k+l 


This  reveals  that  there  is  no  simple  relationship  between 
the  matrices  [AkAk]+   and   f\ /k+1AkAk$k /k+1l +- 
Thus  (3.34)  cannot  easily  be  adapted  to  yield  the  best 
approximate  estimate  for  the  state  vector  at  k+1.   However 
an  acceptable  alternative  for  a  starting  procedure  is  to 
estimate  x,  sequentially  using  (3.28)  and  (3.34)  until  P, 
reaches  maximum  rank;  then  (3.38)  may  be  used.   The  inter- 
mediate estimates  when  P,  has  rank  r<n  are  given  by 

*k  =  $k,l*l  • 
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IV.   FINITE  ITERATION  METHODS 
— — — • — — 

In  previous  chapters  methods  for  the  solution  of  Ax=b 
and  for  the  pseudo  inversion  in  recursive  form  have  been 
presented  for  the  solution  of  the  sequential-estimation 
problem.   In  this  chapter,  finite  iteration  methods  for  the 
solution  of  a  set  of  linear  equations  and  for  matrix  pseudo 
inversion  are  presented.   These  methods  are  based  upon  an 
infinite  sequential  error-correcting  scheme,  proposed  by 
J.  Nagumo  and  A.  Noda  [10],  combined  with  the  Gram  -  Schmidt 
process  [9].   The  derived  methods  require  only  a  finite 
number  (equal  to  the  rank  of  A)  of  iterations.   This 
approach  has  also  been  considered  by  L.  D.  Pyle  [23]  and 
some  of  the  results  presented  in  this  chapter  are  similar 
to  his.   For  the  proper  use  of  Pyle's  algorithm  it  is 

necessary  to  rearrange  the  given  set  of  equations  whenever 

T 
the  constant  b,  in  the  first  equation,  a,x  =  b, ,  is  equal 

to  zero.   Since  this  may  not  always  be  convenient  in  practice, 

an  alternate  algorithm  is  presented  in  which  the  computation 

starts  unconditionally.   Section  A  presents  the  basic 

iteration  procedure  with  geometric  interpretation.   In 

Section  B(l)  this  method  is  combined  with  the  Gram  -  Schmidt 

process  resulting  in  a  procedure  for  the  solution  of  a 

consistent  set  of  equations.   These  results  are  extended 

in  Section  B(2)  to  solve  a  set  of  inconsistent  equations 

and  the  solution  is  shown  to  be  identical  to  the  best 

approximate  solution  according  to  Penrose.   An  alternate 
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method  accomplishing  the  same  result  is  then  derived.   In 
Section  C,  the  foregoing  methods  are  extended  to  solve  for 
the  matrix  inverse,  when  it  exists,  and  for  the  pseudo 
inverse.   In  Appendix  A  these  results  are  applied  to  the 
iterative  solution  of  a  set  of  non-linear  equations. 

A.   INFINITE  ITERATION  PROCEDURE 

Consider  the  problem  of  solving  the  following  consistent 
set  of  equations.   The  term  consistent  is  used  to  denote 
the  fact  that  the  system  of  equations  is  assumed  to  have 
either  an  exact  solution  or  a  unique  locus  for  the  solution. 
Alternatively  b  is  contained  within  the  vector  space 
spanned  by  the  column  vectors  of  A. 

Ax  =  b  (4.1) 

T 
in  b.   Then  (3.1)  may  be  written  as 


Let  a.   represent  the  i ' th  row  of  A  and  b.  the  i ' th  element 
— i    *  i 


T      w 

a,  x  =  b, 

T 
a2-   D2  (4.2) 


a  x  =  b 

m—    m 


The  iteration  scheme  proposed  by  J.  Nagumo  and  A.  Noda  [10] 
solves  each  equation  in  (4.2)  successively  for  x  by  adding 
to  each  approximation  for  x  a  correction  of  appropriate 

size  in  the  direction  normal  to  the  hyper-plane  in  x-space 

T 
represented  by  a.  x  =  b..   After  solving  the  m'th  equation 
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the  process  starts  over  again  with  the  first  equation.   Let 
the  i ' th  approximation  for  x  be  denoted  by  x.  and  the 
initial  estimate  for  x  by  x  =0.   Then  the  method  may  be 
presented  as 


x 


i 
i  ,  •   =  x_  ,  „   +  (b,  -a-,  x_  ,  .  ) 
l+jm    — 0+jm      1  — 1— 0+]m 


*1 

T 
^1*1 


— 2  +  jm 


T        —2 

— 1+im      2  — 2— 1+im   T 


*2-2 


x 


V 


-m+ jm 


+  (b  -ax 


a 

— m 


t 


— m-l+jm   m  —m— m-l+jm  T 
J  J      a  a 

— m— m  / 


(4.3) 


Eq.  (4.3)  may  be  derived  as  follows.   Consider  the  i ' th 
equation  in  (4.2)  and  assume  that  the  solution  has  the 
form 


x.  =  x.  ,  +  Ax. 
— l    —l-l    — i 


(4.4) 


where  x.  ,  is  the  solution  for  x  obtained  from  solving  the 

—l-l  —  3 

(i-l)st  equation.  Combining  the   i ' th  equation  of  (4.2) 
with  (4.4)  yields 


T  T 

(b  .  -  g , x,  , )  =  a. Ax, 
l    ^i—i-l     —l  — l 


(4.5) 


which  may  be  solved  for  Ax.  using  the  best  approximate 


—  l 


solution  according  to  Penrose,  namely 


T        -i 
Ax.  =  (b.  -  a.x.  ,)  —7= — 
—l      l    —i—i-l    T 

a.  a. 
—l—i 


(4.6) 


Then  Ax.  is  the  minimum  norm  solution  of  (4.5),  and 


—  l 
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Si  =  2i-i  +  (^i  -  ^   )  -Jt-  (4.7) 

a.  a . 
—l—i 


This  describes  the  equations  in  (4.3)  before  the  iteration 
process  starts  over  again  with  the  first  equation  in  (4.2). 
The  convergence  rate  of  this  iteration  scheme,  although 
quite  rapid  at  first,  decreases  asymptotically  towards  zero 
as  the  solution  is  approached.   The  limit  is  the  exact 
solution  if  the  system  is  determined,  that  is,  if  the  rank 
of  A  is  equal  to  the  number  of  unknown  elements  in  x.   How- 
ever, if  the  system  is  undetermined,  the  minimum-norm 
solution,  as  discussed  in  Chapter  III,  is  approached  since 
each  correction  Ax.  is  in  the  direction  normal  to  the  plane 
described  by  the  i * th  equation  in  (4.2) 

B.   FINITE  ITERATION  PROCEDURE 

1.   Sets  of  Consistent  Equations 

The  foregoing  iteration  scheme  requires  an  infinite 
number  of  steps  to  converge  to  the  solution.   If  the  process 
is  truncated,  only  an  approximation  is  obtained.   As  will  be 
demonstrated  this  difficulty  may  be  remedied  by  constraining 
the  corrections  to  be  orthogonal  to  each  other. 

Again  consider  the  set  of  consistent  equations  (4.2) 
where  each  vector  a.  is  normal  to  the  hyper-plane  described 
by  the  i ' th  equation  in  (4.2).   These  normal  vectors  are 
generally  not  orthogonal  to  each  other.   However,  using  the 
Gram-Schmidt  process  [9],  an  orthogonal  basis  for  the  vector 
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space  spanned  by  the  vectors  a. (i  =  l,...,m)  may  be 

obtained.   According  to  this  procedure  a  sequence  of  vectors 

a.  is  constructed  from  the  set  of  vectors  a.  in  the  follow- 
— 1  —l 

ing  manner: 


*1  =  *1     T 

^2 

-2-2     T    -1 

-1-1  (4.8) 

T 
k-1  a .  a, 
v   —  i— k 
a.  =  a,—  l      —= —  a . 
— k    — k    .  ,   T   — i 
i=l  a . a. 
—l—i 


then  the  set  of  vectors  a-  consists  of  mutually  orthogonal 

—l  T 

%£k 


vectors  only.   Note  that  — —  a.  is  the  component  of  a, 

at  a.  —1  -K 

—l—i 

in  the  direction  of  a.,  which  is  subtracted  from  a,  leaving 

— 1  K 


the  normal  component  to  a. .   In  recursive  formulation  this 

orthogonalization  process  may  be  presented  as  follows. 

Let  {C  ,C, /....C,  }  be  a  sequence  of  n  x  n  matrices  with 
o   1      k         ^ 

C   =1,  then  the  set  of  mutually  orthogonal  base  vectors  a, 
are  obtained  from 


a,  =  C,  na.    if  a,  =  0,  recalculate  a,     using  a.  ,, 

— k    k-1— i  — k    —              — k      3  —l+l 

T 

a,  a,  l  =  1 ,  2  ,  .  .  .  ,  m 

Ck  =  Ck-1     T  k  =  l,2,...,r        (4*9) 
a,  a, 
— k— k 


If  the  process  yields  a  zero  base  vector,  the  corresponding 
vector  a.  is  a  linear  combination  of  the  previously  defined 
base  vectors,  and  the  i ' th  equation  is  a  linear  combination 
of  the  previous  equations.   Therefore  the  i ' th  equation  and 
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the  corresponding  zero  base  vector  may  be  disregarded,  so 
that,  finally,  there  are  only  r  independent  equations  and  r 
orthogonal  base  vectors,  where  r  is  the  rank  of  A  in  (4.1). 
The  exact  solution  for  x  or  the  minimum-norm  solution  for 
x,  if  the  system  of  equations  is  underdetermined,  is 
obtained  as  a  linear  combination  of  these  r  base  vectors. 
Using  the  form  of  (4.7)  a  correction  A_x,  is  made  success- 
ively for  each  of  the  r  mutually  orthogonal  base  vectors 

T        -k 
*k  =  *k-l  +  (bk  "  *k2k-l>  ^T—  (4'10) 

SWk 

It  should  be  noted  that  the  process  (4.10)  terminates  after 
the  r  corrections  are  made.   Thus  the  infinite  iteration 
process  of  (4.3)  essentially  reduces  to  an  r-step  process. 
These  results  are  summarized  in  (4.11) 

a,  =  C,  ..a.         if  a,  =  0,  recalculate  a,  using  a.  .  \ 
— k    k-1— i      — k   —  — k      3  —li  ' 

T 

-k^k 
c      =   c  — 

k    k-1    T 

<*k^k 

ak       T        i=l,2,...,m 

-k  =  -k-1  +  ~f    (bk"-i-k-l}    k  =  1,2,... fr 

a,  a, 

— k— k  / 

(4.11) 
Note  however  that  the  first  equation  which  starts  the  pro- 
cess has  to  have  a  nonzero  element,  b, ,  in  the  vector  b, 
because  if  b,  =  0,  x,  =  0_  which  is  not  correct  generally. 
If  b,  =0  the  process  may  be  started  by  either  choosing 
another  suitable  equation  of  (4.2)  as  the  first  equation 
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or  by  initializing  x   =  a.,  where  a.  x  a.  ¥    0.      That  is, 
J  3  — o    —  j  —j         —i 

a.  is  not  parallel  to  a,.   Eqs .  (4.11)  may  be  rewritten  in 
a  compact  form  as 
7 


~j  k-l-j 


P   =  P 
k    k-1 


T 

k-1— 3— j  k-1 

T 
a ,P,  ,  a  . 
-3  k-1-: 


>  (4.12) 


K.         JL     j    Z.     f     m      •      •      f     10 


where  P   =1  and  the  index  j  denotes  the  succeeding  equation 

T 
in  (4.2)  for  which  a.  P,  a .  ¥■    0.   It  is  interesting  to 

— j   k— j    '  * 

compare  the  form  (4.12)  with  the  form  of  the  recursive 
least-square-error  solution  (2.26).   The  following  example 
illustrates  the  use  of  (4.11). 

Example  4.1:   Use  Eqs.  (4.11)  to  solve  the  following  set  of 
equations  for  x. 


1 

0 

2 

Xl 

1 

1 

1 

0 

X2 

= 

0 

0 

2 

1 

^X3. 

0 

X 

— o 


=  0 


C   =  I 
o 


Step  1: 


fi 


ou  =  C  a,  =  0 
-1     o-l    2 


xx  =  Xq  +  (b1-a1xo) 


a 


T 

a .  a . 
—l—i 


rv 

0 
0 


tti 


r:2] 

o  i 

.4 


c,  =  c  - 

1     o 


T 
a .  a . 
—l—i 

T 
a .  a . 
—i—i 


.8   0   -.4 

0   1     0 

-.4   0    .2 
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Step  2:   a„  =  C,a2 


.8 

1 
-.4 


*2 


£i 


T      ^2 
(b2-a2x1)  ~y— 


.2 

1 
9 

.8 

1 

0 

1 

-1 

.4 

-.4 

4 

Step  3: 


C2 

=    Cl    " 

«2 

T 

^2 

°L2       i 
9 
^2 

"-10/9" 

2.3 

=   C2a3 

+10/9 
+    5/9 

*3 

=   *2   + 

(b 

3"^2) 

4 
-4 
-2 


-4 
4 
2 


-2 
2 

1 


a3a3 


1 

_1 

-.8 

.2 

-1 

.8 

= 

-.2 

4 

9 

•4j 

L..4J 

c„  = 


C2  " 


^3 

T 
1^3 


0   0 

0   0 
0   0 


x-.  is  the  unique  solution  to  the  given  set  of  equations. 
The  iteration  sequence  as  well  as  the  planes  described  by 
the  set  of  equations  are  shown  in  Fig.  4.1. 
2 .   Sets  of  Inconsistent  Equations 

If  the  set  of  equations  (4.1)  is  inconsistent,  as 
is  usually  the  case  in  estimation  problems,  the  vector  b 
is  not  completely  contained  in  the  vector  space  spanned  by 
the  column  vectors  of  A.   A  solution  may  be  obtained  by 
solving  the  equation 


Ax  =  bA 


(4.13) 
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LLUSTRATION    FOR  EXAMPLE   4.1 


FIG.    4 
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where  b   is  the  part  of  b  contained  in  the  column  space  of 
A,  or  the  projection  of  b  into  the  column  space  of  A.   The 
remainder  b^  =  b  -  b   represents  that  portion  of  b  which 
is  ignored  in  the  solution,  and  is  orthogonal  to  the  space 
of  A.   This  represents  an  optimal  choice  for  b,  which  may 
be  shown  as  follows: 

Let  the  vector  b  be  decomposed  into  two  components 


k   =  kA  +  k e  (4.14) 

where  b'  is  contained  in  the  space  of  A  and  bp  is  the  part 
of  b  which  is  ignored  in  the  solution  process.   If  x,  is 
the  solution  of  the  consistent  set  of  equations 


bi    =   Ax, 
—A  — k 


then  llb-Ax,   II   =    lib'    -   Ax,     +   b„\\  =   II  b_ 

ii  _      _k  ii         I'  — A  — k         — E  "         "  — E 


m  T         2 

where        ||  b_E  ||  =      2       (b.    -   ^x^.) 

i=l 


The  minimum  of  ||  b  ||  is  obtained  when  b_  =  b^.   Then  the 
solution  for  x  of  the  inconsistent  set  (4.13)  is,  by 
definition,  the  least-square-error  solution.   The  standard 
way  of  achieving  this  projection  of  b  is  to  premultiply 
(4.2)  by  AT. 

T       T 
A  Ax  =  A  b 

T 
_T,    +  A  b^7 
=  A  b  A      -N 
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but  since 

T 
A  b.T  =  0 

— N    — 

ATAx  =  ATbA  (4.15) 

Since  (4.15)  is  a  set  of  consistent  equations,  it  may  be 
solved  using  (4.11). 

Another  finite-step  process  for  the  solution  of  (4.1) 
which  starts  unconditionally  but  involves  a  little  more 
computation  may  be  derived  following  arguments  similar  to 
those  which  led  to  (4.11).   Again  consider  the  equations  in 
(4.2).   In  order  to  obtain  the  minimum-norm  solution  (if 
the  set  is  underdetermined)  the  desired  solution  may  be 
represented,  as  in  the  previous  section,  as  a  linear  com- 
bination of  the  row  vectors  of  matrix  A,  or,  equivalently , 
as  a  weighted  sum  of  different  base  vectors  representing  the 
same  space.   The  set  of  Eqs .  (4.1),  if  it  is  assumed  to  be 
inconsistent,  may  be  written 


b  =  Ax  +  b^T  (4.16) 

—     —    — JN 


where  again  b   is  orthogonal  to  the  space  of  A. 
Now  let 


b  =  A,Aa,  +  A-Aa~  ■»-...  +  A  Aa  +bM  (4.17) 

—    i.  — _l     z  — z.  r  — r  — in 

where  the  Aa,  vectors  are  constructed  to  be  mutually  per- 
pendicular and  A,  A  a,  is  the  part  of  b  parallel  to  the  vector 
Aa,  .   Thus  the  coefficients  A,  are  determined  from  the 
condition 
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(Aak)   (b  -  akAxR) 


=  0 


(4.18) 


as 


a.  A  b 
— k  — 

T  T 
a.  A  AOL 
— k    — k 


(4.19) 


Substituting  (4.19)  into  (4.17)  yields 


a,  A  b 

*  =   T  T     A«l  +  ' 
a, A  Aa, 


T  T, 
a  A  b 

— r   — 

+   m  _ Aa   +  h  , 

a  A  Aa 
— r   — r 


=  A 


r    a,  A  b 
v    — k   — 

k=l  a,  A  Aa, 
— k    — k 


+  *N 


(4.20) 


Comparison  with  (4.16)  yields  the  desired  solution,  namely. 


T  T 
r    a  A  b 

J5  =   *     nT,     *k  • 
k=l  a,  A  Aa. 

— k    — k 


(4.21) 


The  mutually  perpendicular  vectors  Aa,  are  constructed  using 
the  Gram-Schmidt  process.   Using  (4.9)  with  a,  replaced  by 


Aa,  and  a.  replaced  by  Aa. ,  yields 


Aa.  =  C,  ,    Aa.    if  Aa,  =0  ,  recalculate  Aa.  using  Aa.,r| 
— k    k-1   —l       ~k  —  — k     ^  —l+l 


C,  =  C, 


T  T 
Aa.  a.  A 
— k— k 


I  k     k-1     T_T_ 

a,  A  Aa, 
— k    — k 


i  =  1 ,  2  , .  .  .  ,m 

J\.      —  ±.  f  A  f   *    •  •  f  XT 


(4.22) 

Another  equally  acceptable  set  of  mutually  orthogonal 
base  vectors  for  the  column  space  of  A  may  be  obtained  from 
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the  vectors  Ad . ,  where  the  d, 's  are  the  rows  of  the  matrix 

T 
A  A,  since  the  column  space  of  A  may  be  expressed  using  the 

set  of  all  the  vectors  Ad. ,  as  well  as  the  set  of  all 

vectors  Aa . .   Using  Ad.  instead  of  Aa.  in  (4.22)  requires 

less  computation  if  the  dimensions  of  A  are  such  that 

m»n.   Thus  it  is  possible  to  write,  starting  with  C  =  0. 


A3,  =  C,  _-,Ad.    if  A3,  =0_,  recalculate  A3,  using  Ad.  ,n 


^ 


c  =  c 

k    k-1 


m   m 

AW 

T  T 
BkA  A6k 


i  =  1,2,.. . , n 


(4.23) 
where  3,  replaces  pt,  in  (4.17)  and  (4.21). 

In  order  to  obtain  the  solution  (4.21),  Eq.  (4.23)  is  modi- 
fied to  yield  an  explicit  form  for  the  calculation  of  the 
3j_'S/  which  when  multiplied  by  A  yield  the  orthogonal  base 
vectors  for  the  column  space  of  A. 


ck-i  ■  *  - 


T  T 
T  T 

i£A  A£i 


T    T 

Ae.k-A-iA 

T    T 

£*-iA  A£k-1 


then 


T  T 
k-1   A3.6TA 
—l—i 

=  i  -  y       — - — - — 


T  T 
k-1  A3- 3- A 

A3,  =  [I  -   2   -t^~ ]  Ad. 

K        i=l  3  A  A3- 

— i   — i 


T  T 
k-1  3..3.7A  A 

=  A  [I  -   Z   Hs-i ]  ^-i 

i=l  3,.AiA3, 
— i 
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and 


£k  = 


mm-. 

k-1  g. 3-A  A 

—1—1 

T  _    y         _ ± 

T  T 
i=l  g  A  A3- 

—     — 1-> 


d. 

— 1 


Combining  (4.24),  written  in  recursive  form,  with  (4.21) 

yields  the  desired  finite  step  process  for  the  solution  of 

(4.1) .   Let  C  =  I  then 
o 


3_,  =  C,  _-|d.   if  3i_  -   0,    recalculate  $,  using  d.  ,  \ 


C,  =  C 


k-1 


m   m 

W  A 

m   m 


T  T 

KA  b 

x  =  x    +  — 3 

~k'1        B^ATABk  ~k 


^ 


The  process  is  completed  when  k 
solution. 


i 
k 


r . 


-L  /  ^-  /  •  •  •  /  n 

-L  /  ^-  /  *  •  •  f   1- 


j 

(4.25) 


x   is  the  desired 
— r 


C.   MATRIX  PSEUDO  INVERSION 

The  foregoing  computation  methods  may  be  extended  to 
yield  matrix  inversion  or  pseudo  inversion.   However  no  com- 
putation time  comparison  with  already  existing  methods  has 
been  made.   Consider  the  solution  of  the  matrix  equation 


AX  =  I 


(4.26) 


where  the  matrix  A  is  square  of  dimensions  n  x  n  and  non- 
singular.   The  inverse  of  A  may  be  obtained  as  the  solution 
of  (4.26)   from  (4.12).   Let  P   =  I  and  X  =  AT  then 
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xn  = 


P1  lau 

k-l-k 
Xk-1  +   TD 

-kPk-l-k 


T     T 

(k  "  *k  xk-i} 


p,  = 


pk-i + 


T 
Pk-l-k-kPk-l 
T 
-kPk-l-k 


k  =  1,2, 


)   (4.27) 


,n 


.  T  . 
where  1,  is  the  1 ' th  row  of  the  identity  matrix  and  the 

n ' th  approximation  for  X,  X   is  the  desired  inverse.   The 

T 
initial  condition  X  =  A   is  chosen  to  ensure  that  the 

o 

starting  conditions  for  (4.12)  are  satisfied,  since 

a  ~ ,  a  _. ,  ...,  a   are,  by  virtue  of  the  nonsingularity  of  A, 

not  parallel  to  a, . 

The  pseudo  inverse  may  be  obtained  as  follows. 

Let  C  =  I  and  Xn  =  0 ,  then 
o  0 


r 


^-k  =  c] 


,d.   if  6  =0,  recalculate  6,  using  d. , 
i-l—i     — k  —  — k       — 1+ 


> 


C   =  C 
^k    uk-l 


T 
-k-k      T 


T  T 


V. 


T 

6,  6r 

y   =  y     +    K  K 
Ak    Ak-1    ,T  TAX 
c^A  A^ 


A 


T 


i  =  1 , 2 , . .  .  ,n 


y(4.28) 


T  T 

where  d.  denotes  the  i'th  row  of  A  A,  and  X   is  the  solu= 

tion  for  the  pseudo  inverse  of  A.   Proof  of  (4.28)  follows. 

From  (4.28)  it  is  evident  that 


m   m 

6  A  A5 .  =  0 
-l    -j 


for  i  j-    j 


(4.29) 
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since  the  vectors  A6_.     (i  =  l,2,...,r)  are  orthogonal.   Also 

T 
A  A  may  be  expressed  as  a  matrix  whose  rows  are  linear  com- 


binations of  the  set  of  vectors  6 . .   Thus 


— 1 


A  A  =  [6_x|  6.2|  .  .  .  \§_   ]    M 


(4.30) 


The  expression  for  C   from  (4.28)  may  be  rewritten  as 


C   = 
r 


mm-, 

r   6.67A  A 
I  -  i     zizi 

-1-        u  m   m 

i=l  6.A  A6 . 
—i    —l  -J 


=  I  - 


T   "1 
6.6. 
—l—i 


mm 

i=l  6  7A  A6. 
—l    —l 


T 
A  A 


(4.31) 


where  the  summation  is  identified  as  Y 


Y  = 

r 

E 

i=l 

T 

T    T 
6.A   A6. 
—l         — l 

so 

that 

C 

r 

=    I    - 

T 
-    YA   A 

(4.32) 


(4.33) 


Using  (4.29)  it  follows  that 


C  6.  =  0 
r— l    — 


(4.34) 


and  thus 


and 


C  ATA  =  0 

r 


C  Y  =  0 
r 


(4.35) 
(4.36a! 
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or       Y  =  YATAY.  (4.36b) 

T 
Using  (4.36b)  to  expand  the  product  A  AY  yields 

T       T    T 
A  AY  =  A  AYA  AY 

or       [ATA  -  ATAYATA]Y  =  0.  (4.37a) 


Since    Y  ^  0  this  yields 

ATA  =  ATAY  ATA  (4.37b) 


From  (4.35) 


T       T    T 
A  A  -  YA  A  A  A  =  0 


and  then  also   [ATA  -  YATA  ATA]Y  =  0 


This  is  combined  with  (4.37a)  to  yield 

[ATAY  -  YATA]  ATAY  =  0.  (4.38a) 

Since  ATAY  ^  0 ,  then 

ATAY  =  YATA.  (4.38b) 


Eqs.  (4.36b),  (4.37b)  and  (4.38b)  by  definition  establish  Y 

T 
as  the  pseudo  inverse  of  A  A: 

Y«[ATA]+  (4.39) 


Comparing  (4.32)  with  the  last  equation  in  (4.28)  when 
k  =  r  yields 

Xr  =  YAT  =  [ATA]+AT  =  A+  (4.40) 
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Penrose  [5]  has  also  suggested  a  recursive  method  for 
computing  the  pseudo-inverse.   Let 


then 


Cl  =  X 


Ck+1  =  l    *  k   trace  (CkATA)  "  ckATA  (4.41) 


T  .  T 

The  product  C   ,  A  A  =  0,  where  r  is  the  rank  of  A  A, 

T 
then  the  pseudo  inverse  of  A  A  is 


[ATA]+  =  £ —  Cr  (4.42) 

trace  C  A  A 

The  proof  is  given  in  Ref.  5.   It  should  be  noted  that  the 

Penrose  method  involves  at  least  one  matrix  multiplication 

3  3 

for  each  step,  or  approximately  rn   operations,  where  n 

operations  are  required  to  perform  the  multiplication  of 

two  n  x  n  matrices,  whereas  the  method  of  (4.2  8)  requires 

2 
approximately  5rn   operations  to  obtain  the  same  result, 

T   + 
namely  [A  A]  . 
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V.   RECURSIVE  ALGORITHM  FOR  THE 
SLIDING-WINDOW  OBSERVER 


Since  the  work  of  Luenberger  [11] ,  deterministic  linear 
observation  systems,  called  observers,  have  been  recognized 
as  practical  alternatives  to  statistical  optimum  linear 
filters  when  efficient  and  fast  real-time  estimation  of 
the  system  states  is  desired.   Avoiding  problems  associated 
with  the  estimation  of  a  priori  statistics,  the  observer 
simply  solves  the  estimation  problem  as  a  deterministic  one 
and  disregards  statistical  quantities.   The  simplest  formu- 
lation for  the  observer  is  the  sampled-data  type  which 
accepts  the  measurements  or  observations  only  at  discrete 
points  in  time. 

Consider  the  sampled-data  system 


*k  =  *k,k-l  *k-l 


z,  =  M,  x, 
k     k  — k 


(5.1) 


where  x,  is  the  system  state  vector  at  time  instant  k, 


,  ,  is  the  general  time-varying  transition  matrix  from 


time  (k-l)T  to  kT.   M,  is  the  time-varying  observation 


matrix  of  dimension  1  x  n,  and  z,  is  the  scalar  observation 
As  in  other  chapters,  only  the  case  of  scalar  observations 
is  considered  here  in  order  to  obtain  results  without  time 
consuming  matrix  inversions.   The  observer  for  the  system 
(5.1)  is  given  as 
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x,  =  $,  .  ,  x,  ,  +  g,  ( z,  -M,  $,  .  ,  x,  ,  )         (5.2) 
— k    k,k-l— k-1   — k   k   k  k,k-l— k-1  \~"*» 

where  x,  is  the  estimate  at  time  kT. 

This  observer  equation  may  be  rewritten  as 

*k   =   Fk*k-1   +   2k^k  (5.3a) 

where 

Fk  -  [I  -  fcWWi  (5-3b) 

The  matrix  F,  is  called  the  observer  transition  matrix. 
Bona  [12]  has  shown  that  the  eigenvalues  of  F,  ,  which  are 
dependent  on  the  choice  of  g,  ,    determine  the  performance 
of  the  observer  in  processing  noise-contaminated  observa- 
tions.  The  time-varying  gain,  g,  ,    for  the  optimal  filter 
is  determined  such  that  the  trace  of  the  error  covariance 
matrix  is  minimized.   For  a  specific  time-invariant 
observable  system  [3],  Bona  [12]  has  demonstrated  that 
constant  gain  observer  with  eigenvalues  of  approximately 
0.5  approach  the  performance  of  the  Kalman  filter.   As  an 
observer  design  rule  for  time -invariant  systems,  he  suggests 
the  choice  of  the  largest  eigenvalue  A   of  F  so  that 
(XT )   is  approximately  zero,  with  the  result  that  (F)   is 
approximately  zero  thereby  limiting  the  memory  of  the 
observer  to  approximately  the  last  i   observations.   Because 
of  the  size  limitations  of  the  memory,  this  type  of  observer 
is  also  referred  to  as  a  sliding-wihdow  observer. 
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In  the  following  work,  stimulated  by  a  list  of  un- 
solved observer  problems  [13] ,  a  recursive  relationship 
for  the  sliding-window  of  minimal  length  for  the  general 
system  of  Eq.  (5.1)  is  developed  in  Section  A.   In  Section 
B  these  results  are  extended  for  the  time-invariant  system 
to  yield  the  recursive  form  for  a  sliding-window  of  speci- 
fied length.   The  approach  to  the  solution  is  quite 
different  from  the  one  discussed  by  Bona,  resulting  in  a 
new  design  rule  for  sliding-window  observer  of  exact 
specified  memory  length.   Finally,  the  results  obtained  are 
illustrated  by  an  example. 


A.   THE  MINIMUM-WINDOW  OBSERVER 

The  minimum-window  observer  is  the  fastest  linear 
observer  possible  in  that  it  determines  the  states  of  a 
system  from  the  necessary  minimum  number  of  observations. 
The  eigenvalues  of  the  observer  transition  matrix  are  all 
zero  [12] .   The  desired  recursive  form  is  derived  as  follows 
Consider  system  (5.1)  and  its  observer  equation  (5.3). 
Suppose  the  system  is  of  order  n.   At  each  instant  of  time 
k  the  state  vector,  x,  ,  is  determined  from  the  last  n 


observations .   Thus 


Jk-n+l 


'k-n+2 


Jk-1 


Mk-n+l  $k-n+l,k 
Mk-n+l  \-n+2,k 


M 


k-1 


k-l,k 


M, 


*k 


(5.4a) 
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or 


z^  =  R(k)  xk 


(5.4b) 


where  the  matrix  R(k)  relates  the  n  last  observations  z,  to 
the  state  vector  x,  .   The  observability  condition  for  this 
system  is  then  that  R(k)  is  nonsingular  for  all  k.   Because 
of  this  it  is  not  necessary  that  each  of  the  rows  of  R(k) 
be  the  product  of  an  observable  pair.   The  solution  to 
(5.4b)  is  then  trivial 


xk  =  R(k)  1zk 


(5.5) 


where  x,  denotes  the  output  of  the  observer.   Let  the 
matrix  R   (k)  =  C(k)  be  partitioned  into  column  vectors 


C(k)  = 


c,  (k)  I  c0  (k)  I  .  .  .  I  c  (k) 
— 1    i  —z         i     i  — n 


(5.6) 


Then    (5.5)    may  be  written    as 


xk   =    c1(k)zk_n+1+c2(k)zk_n+2+...+cn^(k)zk_1+cn(k)zk    (5.7) 


Expanding   Eq.     (5.3)    yields 


x,     =    F,  x,     ,    +   g,  z, 
— k  k— k-1         -k   k 


"     FVFV-1^_9     +     Fviv-lZk-l      +     ik2 


k   k 


FkFk-l' * ,Fk-n+l-k-n+l    +    FkFk-l* * ' Fk-n+2^k-n+lZk-n+l 

+   FkFk-l' * ,Fk-n+3^k-n+2  *   zk-n+2 


+   Fk2k-lzk-l 
+^k    zk 
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(5.8) 


Comparing  (5.7)  and  (5.8)  on  a  term  by  term  basis  leads  to 
the  following  conclusions 


(1)  FkFk-l"-Fk-n+l  -  ° 


(2)  g   =  C  (k) 
— n    — n 


(3)  c_.(k)  =  Fkcj+1(k-l)    j  =  1,2,. ..,n-: 


(5.9a) 
(5.9b) 
(5.9c) 


so  that 


(k) 


F.  c,(k-l)  |  ...  iF,  c  (k-1)  iC  (k) 
k— 2  k— n       n 


(5.9d) 


The  recursive  relation  for  the  gain  vector  g_,  =  c  (k)  may 
be  determined  from  (5.9a)  using  (5.9c)  repeatedly.   Thus 


F,  [F,  n...F,   ^,g,   ]  =  F.  c,  (k-1)  =  0 
k   k-1     k-n+1— k-n      k  — 1 


(5.10) 


since 


Fk  ■  [I  -  2A]  *k-l,k 


(5.10)  may  be  rewritten  as 


$ 


k-l#kHl(k-1)  =  2k\  *k_l,k  £i(k-D 


(5.11) 


The  desired  result  is  then  obtained  directly  from  (5.11) 

Sk  -  £n(k)  =  *k-l,k  «**«  t"k  *k-l,k  ^(k-D]"1     (5.12a: 

Note  that  for  scalar  observation  Eq.  (5.12)  may  be  also 
written  as 


2k 


Vk-i,k£i|k'1J 


(5.12b) 
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For  scalar  observation  the  desired  algorithm  is  therefore 
given  by  Eqs.  (5.12b),  (5.9b),  (5.9c)  and  (5.3).  This  is 
summarized  in  (5.13): 


r 


{ 


ak  ■ 


Vi,^!**-1* 


A 


Vk-i,kSi(*-i) 


k  =  [I  -  gA]  % 


c  .  (k)  =  Fkc  .  +  1(k-l)  ,  j  =  l,2,...,n-l;  £n(k)=g_k 


(5.13) 


x,  =  F,  x,  ,  +  g.  zn 
\  — k    k  — k-1   ^-k   ] 


J 


This  completes  the  derivation  of  the  recursive  algorithm  for 
the  minimum-window  observer.   The  computation  time  required 
is  almost  equal  to  the  computation  time  required  for  the 
corresponding  optimal  filter  (3.38).   For  linear  time- 
invariant  systems  with  a  time-invariant  observation  matrix, 
the  observer  gain  and  transition  matrix  are  constant. 
Thus  (5.13)  reduces  to  the  observer  equation 


x,  =  Fx,  .  +  gz, 
— k    —k-1    —  k 


(5.14) 


and  the  required  computation  time  is  reduced  substantially. 
Although  the  minimum-window  observer  is  highly  sensitive 
to  measurement  noise,  practically  excluding  the  direct  use 
of  (5.14)  for  non-deterministic  problems,  it  is  nevertheless 
possible  to  construct  acceptable  filter  schemes  from  (5.13) 
or  (5.14)  for  special  applications.   Consider,  for  example, 
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an  oscillatory  time-invariant  system  given  by  Eq.  (5.1)  with 
zero  damping.  Then  a  simple  averager  following  the  minimum- 
window  observer  is  characterized  by 


x,  =  Fx,  ,  +  gz, 

— k  — k-1    -2-  k 

*  1   *   *k-i-     k-1  * 

x,  =  7-      Z   $    x.  =  — -, —  $x 

-K  K  i=1       -1 


1  - 

'-k-1    k  -k 


(5.15) 


where   x,  ,  the  output  of  the  combined  filter,  results  in  a 
filter  performance  almost  indistinguishable  from  the  optimum 
filter  (3.38).   This  is  demonstrated  in  Example  5.1.   Note 
that  the  computational  requirements  are  only  a  fraction  of 
the  computational  requirements  for  the  optimal  filter.   It 
is  the  author's  opinion  that  this  result  merits  further  in- 
vestigation in  order  to  find  some  design  rules  for  simple 
and  fast  observers  with  almost  optimal  performance  similar 
to  the  one  given  in  Eq.  (5.15). 

Example  5.1 

Let  the  system  equations  (5.1)  be  given  as 

\ 


*k  =  $*k-l 


.9   -.5 
.38   .9 


*k-l 


\   =  \xk  +  vk  =[1   0]xk  +  V] 


(5.16) 


where  the  observations  are  now  noise- contaminated  with  the 
measurement  noise  v,  ,  a  noise  sample  drawn  from  a  uniform 
distributed  noise  population  of  zero  mean  with  maximum 
deviation  of  ±0.1.   Estimate  the  system  state  vector  x_k 
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using  (5.15)  when  the  system  starts  with  x  = 

— o 


1 
.1 


Results  for  a  typical  noise  sequence  {v-,  ,v?  .  .  .  v,  }  are  shown 
for  both  state  variables  xl,  and  x2,  in  Figs.  5.1  and  5.2. 
Note  that  the  filter  response  closely  matches  the  actual 
system  response.   The  estimation  error  defined  by 


-k  "  -k   -k 


(5.17) 


is  compared  with  the  estimation  error  of  the  least-square- 
filter  in  Figs.  5.3  and  5.4. 


B.   SLIDING-WINDOW  OBSERVER  FOR  TIME- INVARIANT  SYSTEMS 

For  the  case  of  time-invariant  observable  systems  with 
a  time-invariant  observation  matrix  it  is  possible  to  in- 
crease the  memory  of  the  observer  to  an  arbitrary  length  I 
where  &  >_  n.  This  results  in  a  sliding-window  observer  of 
length  I,  where  the  state  vector  x,  is  determined  from  the 
last  I  observations  in  the  least-square-error  sense.  Con- 
sider the  first  set  of  I   observations  according  to  (5.1). 


ft-1 


-I 


M$ 

• 

M 


-£+1 


-U2 


-1 


*£ 


(5.18) 


or 


2£  "  A  x£, 
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FILTER  RESPONSE   FCH   STRTE  UfiRIRBLE  XI 


B.pJ  |       10.  DO 


FIG.  5.1    EX.51 


+  UltjP.I    SYSTEM  RESPOMSf 

FILTtfl  RESPBK3E     EQ.    f4. 153 


FILTER  RESPONSE  FCR  STRTE  UfiRIRBLE  X2 


<.J 


FIG.  5.2    EX.  5.1 


+  fiCTUPL  SVSfEM  RESPONSE 
FILTER  HESPCN3E     EQ-    (4.15) 
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where  z*  is  the  observation  vector  of  the  last  i   observa- 
tion, Xj  is  the  n  dimensional  state  vector  at  time  instant 
£,  and  A  is  the  £  x  n  matrix  of  constants  relating  observa- 
tions and  state  vector  x„ .   Since  the  system  is  assumed  to 
be  observable  matrix  A  has  rank  r  =  n.   Then  the  solution 
to  (5.18)  is  given  by 


x  =  PA  z» 


T   -1 


(5.19) 


where 

Eq.  (5.21)  may  be  written  more  explicitly  as 


X£  =  P{  {$~i  +  1)Vz      +  ($"£"f2)TMTz2  +..  .  +  ($"1)TMTz£_1+MTz£} 

(5.20) 
At  time  instant   £+1  Eq.  (5.18)  takes  the  form  of  Eq.  (5.21) 
since  the  memory  of  the  filter  is  limited  to  £  observations. 


"  A  -£+1 


(5.21) 


*£+l 

Then  analogous  to  (5.20),  the  solution  of  (5.21)  may  be 
written  as 

X.+1  =  P(($     )  M  z„  +  («     )  M  z2+...  +  ($   )  M  z  +M  Ze+]_J 

To  obtain  a  recursive  algorithm  it  is  necessary  to  combine 
(5.22)  and  (5.20).   Thus 

xz    -  P($"£+1)TMTZl  =  P$T{($"£+1)TMTz2+. ..+($"1)TMTz}  (5.23) 
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Substituting  z,  =  M$     X . ,  which  is  true  for  noiseless 

—  1  T  —1 
observations,  and  premultiplying  (5.24)  with  P($   )  P 

yields 

P{  ($"£  +  1)TMTz2+. . .+  ($_1)TMTz£} 

=  P ($   ){P    -  ($     )MM$     }x 

=  P{($_1)T  ATA$_1  -  ($"£)TMTM$"£}$x£ 

=  Pl($   )   [  ($     )  M  M$      +  ($     )MM$      +... 
+  MTM]0_1  -  ($"£)TMTM$"£}  <3>x£ 

=  P{(<1>     )  M  M$     +...+$   M  M$   }$£o 

=  P{A  A  -  MM}  $x£ 

=  [I  -  (PMT)M]  $x£  (5.24) 

T 
Identifying   g  =  PM   (5.24)  may  be  written 

P{ ($"£+1)TMTz2+. . .+($_1)TMTz£}  =  [I-gM]$x£   (5.25) 

It  is  interesting  to  note  that  g  is  given  by  the  last  column 
of  A  ,  as  a  generalization  of  (5.9b)  of  the  previous  section. 
Substituting  (5.25)  into  (5.22)  yields  the  desired  recursive 
relationship 


wh 


ere  the  estimate  x   ,  is  no  longer  dependent  on  the  ob- 


servation z-.  .   It  then  follows  that  the  next  estimate 
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xz+2   =  [I-gM]  $x£+i  +  2£+2  (5.27) 

is  no  longer  dependent  upon  the  first  two  observations  z, 
and  z„.   Therefore  the  general  recursive  formulation  for  all 
discrete  times  k  >  I    is  given  by 

k-1   =   F*k  +  azk+l  (5.28) 

where  F  =  [I-gM]$  and  g  remain  constant.   The  complete 
algorithm  for  the  general  rectangular  sliding-window  ob- 
server, with  the  first  state  estimation  available  after 
£  observations  are  processed,  is  then 


^  T  -1       T 

x£    =     [A   A]       A      z£  k    =    I 


X-k+l    =    Fxk    +    2Zk+l  k    *    l 


(5.29) 


The  estimation  error 

xk  =  xk  -  xk  (5.30) 

obeys   the  same  dynamic  relation  as  x,  .   Thus 

*k+i  =  F*k  +  avk+1  (5.3D 

If  the  covariance  matrix  of  x,  is  denoted  by  C,  it  follows 
from  (5.31)  that 

ck+i  =  *k+i*k+i  =  FCkpT  +  *RgT  (5-32) 

where  R  is  the  variance  of  the  measurement  noise.   Let 


£k  =  |Vck(l,D   |  +  |Vck(2#2) 
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where  the  sequence  of  matrices,  C,  ,  is  obtained  from  (5.32), 
R  =  1,  and  C  (i,i)  denotes  the  i    diagonal  element  of 
the  covariance  matrix.  e,     is  a  relative  measure  of  the 
expected  absolute  estimation  error  at  time  instant  k.   A 
comparison  in  terms  of  this  relative  absolute  estimation 
error  for  a  few  sliding-windows  of  the  system  in  Example  5.1 
with  the  corresponding  error  of  the  least-square-error 
filter  is  shown  in  Fig.  5.5. 
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VI.   SUMMARY  AND  RECOMMENDATIONS  FOR  FURTHER  STUDY 

(1)  The  normalized  least-square-error  solution  of  a  set 
of  inconsistent  linear  equatons  has  been  established  as 
an  alternative  to  the  usual  least-square-error  solution. 
The  solution  is  obtained  by  minimizing  a  weighted  least- 
square-error  criterion  and  presented  in  recursive  form 
for  obtaining  sequentially,  the  solution  of  a  growing  set 
of  equations.   The  technique  is  illustrated  by  some 
estimation  and  identification  problems.   Further  investi- 
gation is  required  to  establish  a  decision  criterion  to 
determine  whether  the  normalized  least-square-error  or 
the  least-square-error  solution  method  is  to  be  preferred 
in  engineering  problems.   It  is  expected  that  this  decision 
criterion,  in  the  case  of  problems  involving  discrete 
state  estimation  or  parameter  identification,  will  depend 
not  only  upon  the  system  equations  themselves  but  also  upon 
the  nature  of  the  measurement  noise. 

(2)  Complete  recursive  algorithms  for  the  normalized  least- 
square-error  solution  and  the  least-square-error  solution 
based  upon  the  concept  of  the  best  approximate  solution  [5] 
are  presented. 

(3)  A  finite  step  algorithm  for  the  calculation  of  the 
pseudo  inverse  and  the  best  approximate  solution  of  a  fixed 
set  of  linear  equations  is  proposed.   This  result  has 
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advantage  over  previously  published  recursive  computa- 
tion methods  in  that  the  algorithm  presented  starts 
unconditionally . 

(4)  Finite-memory,  linear,  observation  filters  in  recur- 
sive form  are  proposed  for  the  sequential  state  determin- 
ation of  a  sampled-data  system.   Design  rules  for  these 
observation  filters,  when  the  system  is  time-invariant, 
are  given: 

(a)  for  the  minimum-window  and  averager  observer  where 
the  system  states  as  determined  from  the  minimum  set  of 
past  data,  are  smoothed  by  a  simple  averager.   This 
procedure  is  shown  to  be  very  efficient  for  an  oscillatory 
system  with  zero  damping.   Further  investigation  is  recom- 
mended to  improve  the  estimation  for  general  systems,  using 
the  minimum-window  observer  together  with  a  weighted 
averager.   It  is  expected  that  optimal  weighting  can  be 
approached  using  a  scalar  weighting  factor  in  the  recursive 
form. 

(b)  for  the  sliding-winding  observer  of  memory  length 
£,  I    >    n,  where  the  n  states  of  the  system  are  determined 
from  the  last  I   measurements  in  the  least-square-error  sense. 

(5)  In  the  Appendix,  the  recursive  algorithm  of  the  best 
approximate  solution  is  applied  to  the  numerical  solution 
of  a  set  of  nonlinear  equations.   The  results  are  promising 
in  that  solutions  are  obtained  when  other  well  known  linear- 
ization or  gradient  techniques  fail.   The  theory  behind 
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this  approach  has  not  been  investigated  fully  and  it  is 
recommended  that  further  mathematical  work  be  persued  to 
establish  rigorous  proofs  and  conditions  of  convergence. 
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APPENDIX  A 

A.   ITERATIVE  SOLUTION  OF  A  SET  OF  NONLINEAR  EQUATIONS 

The  solution  of  simultaneous  nonlinear  equations  is 
of ten  impossible  to  obtain  analytically,  and  graphical  or 
iterative  methods  for  a  computer  solution  have  to  be  em- 
ployed.  In  addition,  the  solution  of  a  class  of  nonlinear 
differential  equations,  as  discussed  in  Section  B,  reduces, 
after  integration,  to  the  solution  of  nonlinear  algebraic 
equations  at  each  step  of  the  integration  process  [16]. 
The  most  commonly  used  iterative  methods  are  based  upon 
linearization  techniques,  i.e.,  Newton-Raphson  method  and 
linear  interpolation,  or  upon  some  gradient  method  whereby 
the  iterative  approximation  to  the  solution  is  sequentially 
improved  such  that  some  error  measure  is  forced  to  decrease 
[16],   All  these  methods  however  may  not  converge,  or  they 
may  converge  to  a  solution  at  infinity.   In  addition,  the 
values  of  the  functions  as  well  as  their  gradients  may 
have  to  be  calculated  at  each  iterative  approximation. 
This  calculation  might  be  impossible  if  the  approximation 
is  outside  the  range  or  domain  of  one  or  more  of  the 
functions,  or  if  one  or  more  of  the  functions  has  a  dis- 
continuity at  that  particular  approximation  point  for  the 
solution.   Further  complications  arise  from  the  fact  that 
the  set  of  nonlinear  equations  may  have  more  than  one 
solution  and  the  above  iteration  schemes  may  converge  (if 
they  converge  at  all)  to  an  undesirable  solution  point. 
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Using  the  solution  methods  for  a  set  of  linear  equa- 
tions, as  discussed  in  Chapter  IV,  a  new  iterative  pro- 
cedure is  developed.   This  procedure  circumvents  the  problems 
and  drawbacks  of  the  foregoing  methods,  and  converges  to  a 
single  point  if  one  or  more  solution  points  exist.   If  this 
point  is  no  solution  point  the  initial  estimate  has  to  be 
displaced  in  the  direction  of  the  preferred  solution  point, 
where  the  preferred  or  desired  solution  is  defined  (in 
accordance  with  the  concept  of  the  minimum-norm  solution) 
as  the  solution  which  is  closest  to  the  initial  estimate. 
The  class  of  nonlinear  functions  included  in  the  iteration 
process  in  general  are  all  functions  which  can  be  represen- 
ted in  polynomial  form  or  which  have  power-series  expansions. 

1 .   Development  of  the  Method 

Let  a  set  of  n  nonlinear  functions  in  n  unknowns  be 
given  as 


h(x)  = 


h1(x) 
h2(x) 


h  (x) 
n  — 


=  0 


(A.l) 


Using  the  polynomial  form,  or  a  power-series  representation, 
(A.l)  may  be  written  as 


Ax  +  Bg (x)  =  c 


(A. 2) 


where  A  and  B  are  constant  matrices  of  dimensions  n  x  n  and 
n  x  m  respectively.   The  elements  of  the  n  x  I  vector 
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T 
c  =  [c,  ...  c  ]   are  the  negatives  of  the  constant  terms 

in  the  polynomial  or  series  form.   The  vector 

T 
g(x)  =.  {g,  (x)  .  .  .  9  (x)  J   has  dimensions  mxl.   Its  elements, 

the  functions  g.<X)  are  the  nonlinear  remainder  terms  of  the 

corresponding  function  h. (x) ,  or  parts  thereof,  chosen  such 

that  the  functions  g. (x)  are  defined  for  all  x. 

Range-or-domain-limited  functions  g. (x)  can  be  accepted 

only  if  it  is  possible  to  reformulate  these  functions  in 

terms  of  variables  for  which  they  are  always  defined.   As 

an  example  consider  the  equation 


y  =  £nx  =  0  (A. 3a) 

which   has    no   real    solution  for   x        0.      However   the   same 
equation   may   be  written    as 

ey    -    x  =    0  (A. 3b) 


or 


2  3 

U   +   y   +   yrr+yn-  +    ...}    -    x  =    0  (A. 3c) 


2! 


or 


[-i  +    1]    fxl    +    [1]    gn  (y)    +1   =    0  (A. 3d) 


i  +    1]    Txl    +    [1]    gi(y)    +1   =    0 


where 


2           3 
gl(y)    =    2T  +    3T  +    (A.3e) 

Now   g, (y)    is    defined    for    all   y    and    the    domain-limited   Eq. 
(A. 3a)    is    acceptable    in    the    iteration   process    in    the    form 
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of  Eq.  (A. 3d).   This  separation  of  the  nonlinear  part  of  h. 
from  the  linear  and  constant  part  has  the  advantage  that 
the  discontinuities  of  h .  do  not  appear  in  g . .   Thus,  for 
example,  if 


2  yx  +  x  -  1  =  0 


(A. 4a) 


or 


[1   0] 


[2]  xy 


(A. 4b) 


y  is  not  defined  for  x  =  0.   However  the  function 


g2(x,y)  =  x.y 


(A. 4c) 


is  defined  for  all  x  and  y 
Let 


x  =  x   +  Ax 
—   -o    — 


(A. 5) 


where  x  is  the  desired  solution,  x   is  the  initial  estimate 
—  — o 

for  x,  and  Ax  is  the  necessary  correction  to  x   in  order  to 

—       —  J  -o 

satisfy  Eq.  (A.l).   Eq.  (A. 2)  may  be  rewritten  as 


AAx  +  B[g(x  +Ax)-g(x  ) ]  =  c  -  Ax   -  Bg(x 
—      —  — o   —   —  — n      —     — -o     —  — 


— O 


(A. 6a) 


or 


[A  +  BD  (x  ,  Ax) ]  Ax 
— o   —    — 


(A. 6b) 


where  c1  is  a  vector  of  constants  defined  as  the  right  side 
of  Eq.  (A. 6a)  and  the  elements  of  the  m  x  m  matrix  D  are 
defined  from  the  total  difference  quotients  of  the  functions 
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g. (x) ,  (i  =  l,2f..,,m).   Thus 

[D(x  ,Ax)]  Ax  =  g(x   +  Ax)  -  g  (x  )  (A. 6c) 

— o  —    —   —  — o    —    ■•■  — o 

This  may  be  best  explained  with  an  example.   Again  consider 
Eq.  (A. 4c) .   Then 


g2(xQ  +  Ax,  yQ  +  Ay)  -  g2(xQ,  yQ) 


=  (x   +  Ax)  (y   +  Ay)  -  x  y 
o        2  o  2  oJ  o 


=  *QAy  +  YQ  Ax  +  AxAy  (A. 7a) 


which  may  be  written  as 


&.*■¥     v¥J[-j 


(A. 7b) 


^    3nH  fv   +  Ax- 


Then  the  terms  (y   +  -^-)  and  (x  +  —*-)    are  elements  of  the 

matrix  D.   In  general  all  the  elements  in  the  matrix  D 

are  dependent  on  the  initial  estimate,  which  is  a  constant 

vector  during  the  iteration  process,  and  the  value  of  Ax, 

which  is  unknown.   The  algorithm  proposed  sequentially 

updates  an  estimate  for  the  elements  of  D(x  ,Ax)  until  the 

true  values  are  obtained.   Then  the  last  approximation  for 

Ax  is  the  desired  correction  for  the  solution  given  in 

(A. 5).   From  (A.l),  and  the  given  initial  estimate  x  , 

— o 

calculate  A,  B  and  c'  as  defined  above.   Let  Ax,  denote 
'        —  — k 

the  k    approximation  for  Ax  and  set  Ax  =0.   Then  solve 
iteratively . 


[A  +  BD  (x  ,  Ax,  )  ]  Axlxl  =  c'  (A.  8) 

— o   — k    — k+1    — 
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Using  (4.25)  for  Ax,  +1  until  (A. 5)  satisfies  all  equations 
in  (A.l).   The  solution  is  obtained  from  (4.25)  because  the 
matrix  in  (A. 8)  may  become  singular  at  some  step  in  the 
iteration  process.   This  describes  the  basic  technique, 
whereby  the  solution  is  obtianed  by  iterative  approximation 
of  the  exact  difference  quotients  of  the  functions  g. (x) 
and  not  by  iterative  improvements  of  a  previous  approxi- 
mation to  the  solution.   If  the  process  (A. 8)  converges  and 
the  error 

n 
e  =   E   |h. I  (A. 9) 

i=l.  -1' 

is  acceptably  small,  the  solution  (A. 5)  usually  is  the 
desired  solution  closest  to  the  initial  estimate.   However, 
if  process  (A. 8)  converges  to  a  value  of  Ax  for  which e  is 
not  small,  then  x  =  x  +  Ax  lies  between  two  or  more 
solutions  of  (A.l).   In  this  case  a  displacement  of  the 
initial  estimate  is  necessary  and  the  iteration  has  to  be 
repeated.   As  shown  in  subsequent  examples  the  values  for 
the  elements  of  Ax,  oscillate  in  a  damped  manner  about 
their  exact  value.   In  order  to  increase  the  rate  of  con- 
vergence and,  more  importantly,  to  force  convergence  if  the 
damping  of  these  oscillations  is  slightly  negative  (which 
would  eventually  lead  to  divergence)  additional  damping  may 
be  introduced.   This  is  accomplished  by  using  a  weighted 
average  between  Ax,  ,  and  Ax,  for  the  computation  of  the 

K  —  I  K. 

elements  of  D (x  ,  Ax,  ).   Then  solve  iteratively  starting 
-o   — k 


with  Ax   =  0  using  (4.25) 
-o  3 
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/ 

[A  +  BD(xo,Ax,]  Ax,  ,  =  c' 


< 


(A. 10) 


Ax,  ,  ,  =  ax,  +  (1-a)  Ax,  ,  , 
— k+1     — k  — k+1 


where  a,  0  <_  a  <  1 ,  is  the  coefficient  of  additional  damping, 
a  is  determined  from  the  rate  of  convergence  of  the  process 
(A. 10)  starting  with  a  =  0.   If  after  a  few  steps  of  the 
iteration  are  completed  the  convergence  rate  is  considered 
too  small,  a  may  be  increased  and  the  process  reinitiated. 

This  completes  the  development  and  discussion  of  the  new 
algorithm  for  the  solution  of  a  set  of  nonlinear  equations. 
In  order  to  show  the  power  of  the  iteration  method  (A. 10) 
two  examples,  where  other  techniques  may  fail,  are  given 
below. 

2 .   Experimental  Results 
Example  1 

Find  the  point  of  itersection  of  the  two  curves 


(A. 11) 


y  +  2xy  +  x  =  1 
y  -  xy  +  x   =1 

closest  to  the  given  point  (x  ,y  ) . 

According  to  (A. 6a)  and  (A. 6b)  this  set  of  equations  may  be 

written  as 


(A. 12a) 


[1 

1" 

"x-1 

r  2 

0^ 

xy~ 

"l" 

_0 

lj 

_y_ 

+ 

-i 

i: 

Lx2j 

- 

_1_ 
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or 


ri 

l 

+ 

2    0 

| 

L° 

l 

J-l  1. 

_ 

o  2 


2x  +Ax 
o 


1   1 
0   1 


^Ax 
o  2 

0 


|~Ax 

J  J  Lay 


-V 


2   0 

.0  -ij 


x  y 

X2 

u   o  -' 


(A. 12b) 


Eq.  (A. 12b)  is  now  in  the  desired  form  for  (A. 10). 
Starting  from  the  initial  estimates 


a) 


X 

o 

_ 

0~ 

b) 

X 

o 



~-2~ 

c) 

o 

.8 

,*0- 

_0_ 

' 

*a-> 

1_ 

/ 

*o 

0  _ 

with  no  additional  damping  (A. 10)  yields  the  following 
sequences  for  Ax, 


a) 


b) 


-1.0 

.33 

-.6 

_  2.0 

/ 

1.78^ 

/ 

_2.0_ 

.25 
2.25 


.3975 
1.9  3* 


-.1595 
1.703 


.3993 
1.895 


... 


O 


883 
263 


.516 
-.052 


"  .5993" 
-.1051 


Thus  the  following  solutions  are  obtained 


a) 


r*1 

.  y.J 


-.1595 
1.703 


-%- 
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b) 


O 


X 

1.399 

y_ 

_-.105 

X 

1.399 

Y 

-.105 

The  graphs  of  the  functions  in  (A. 11)  and  the  three  solu- 
tions are  shown  in  Fig.  A.l.   Fig.  A. 2  shows  the  oscilla- 
tion of  the  sequential  values  Ax  for  case  (a).   Fig.  A. 4 
illustrates  the  dependence  of  the  rate  of  convergence  upon 

the  additional  damping  a  where  it  is  assumed  that  the 

-4 
solution  is  obtained  whenever  e  <  10 

Note  that  in  case  (b)  none  of  the  iteration  methods  that 
depend  upon  function  values  could  have  initiated  an  iteration 
and  that  in  case  (c)  other  methods  would  have  converged  to 
a  different  solution  point. 
Example  2 

Find  the  point  of  intersection  of  the  two  curves 


(A. 13a) 


closest  to  the  point  of  (x  ,y  ) 


The  solution 


X' 


-2/7 
+  3 


,  obtained  directly  from  sub- 
stitution is  the  only  finite  solution.   Thus,  no  matter 

what  the  initial  estimate  is,  the  desired  solution  is 

T 
[x*   y*]  .   The  set  of  Eq.  (A. 13)  may  be  rewritten  as 
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ILLUSTRATION    TO  EXAMPLE    I 


Y  +  2XY  +  X   = 

Y  -  XY  +  X2=  2 

X  INITIAL   ESTIMATES 


FIG.    A.I 
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OSCILLATION  OF    AX  AND  AY  -  CASE   A 


FIG.  A.2       EX. 
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CONVERGENCE    AS    A    FUNCTION   OF 
ADDITIONAL    DAMPING  -  EXAMPLE 


i 1 r 

DAMPING   FACTOR 


FIG.     A.3 
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1  1 

1   0 


2 
L2. 


Ax 

^ 

Ax 

"*0+  2  _ 

1 

_Ay_ 

"  1 
-2 

- 

1      l" 

1      0 

X 

o 

- 

~2~ 
2 

x  y 


(A. 13b) 


similar  to  other  iteration  methods,  (A. 9)  will  diverge 

whenever  the  initial  estimate  x   >  0.   While  the  other 

o 

methods  diverge  because  they  iterate  towards  the  solution 


-.5 


,  method  (A. 9)  diverges  because  the  oscillations 
in  Ax  become  increasingly  larger.   This  situation  is  reme- 
died by  using  the  method  of  Eq.  (A. 10)  with  an  additional 
damping  factor  of  approximately  a  =  .7.   Now  the  iteration 
for  Ax  converges  rapidly  so  that  the  desired  solution  is 
obtained  in  only  8  to  9  iteration  steps.   The  solution 
for  the  first  few  iteration  steps,  starting  with  the 


initial  estimate 


X 

1 

o 

— 

lYoj 

L-^J 

as  well  as  the  graphs  of  the 


functions  defined  by  (A. 13a),  are  shown  in  Fig.  A. 4. 

B.   SOLUTION  OF  THE  DYNAMIC  RESPONSE  OF  CIRCUITS  CONTAINING 
NON-LINEAR  RESISTIVE  ELEMENTS* 

The  exact  solution  of  the  dynamic  response  of  a  circuit 

containing  non-linear  resistive  elements  such  as  diodes  or 

transistors  often  presents  problems,  even  when  the  non-linear 

characteristics  are  known,  because  it  may  not  be  possible 

to  solve  for  the  required  variables  explicitly  [6].   The 


The  material  in  this  section  up  to  Eq.  (A. 21)  has  been 
published  in  the  Proceedings  of  the  Second  Annual  Princeton 
Conference  on  Information  Sciences  and  Systems,  1968. 
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LLUSTRATION    TO   EXAMPLE    2 


Y  +  2XY+X  =  I 
2XY  +  X   =  -2 


FIG.    A.4 
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non-linear  resistive  elements  are  considered  as  dependent 

current  (or  voltage)  sources,  dependent  upon  state  or 

other  variables  in  the  network.   The  network  is  then 
characterized  by  the  following  equations. 

x   =  Ax  +  B, u   +  B0i  (A. 14) 

—     —     1— s     2— n 

in  =  N  £-<V     ■  (A. 15) 

vn  =  Rx  +  Sx  us  +  S2  in  (A.  16) 

where    x   =  the  state  variables 

u   =  independent  sources 

i    =  equivalent  current  sources  for  the  non-linear 
elements 

v   =  corresponding  voltages  across  the  non-linear 
elements 

The  matrices  A,  B, ,  B~,  R,  S, ,  and  S~  are  determined  from 
the  network  and  matrix  N  is  defined  from  the  characteristics 
of  the  nonlinear  network  components.   The  roles  of  v   and 
i   in  (A. 14),  (A. 15),  and  (A. 16)  are  interchanged  when 
dependent  voltage  sources  are  used.   Eq.  (A. 14)  is  the  state 
equation  for  the  linear  part  of  the  circuit.   Eq.  (A. 15) 
formulates  the  characteristics  of  the  non-linear  components 
as  given,  for  example,  by  the  Ebers  and  Moll  [20]  equations 
for  transistors.   Eq.  (A. 16)  gives  the  circuit  constraints 
(loop  equations  for  equivalent  current  sources  and  nodal 
equations  for  equivalent  voltage  sources)  between  the 
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voltage  across  the  nonlinear  elements  and  their  currents, 

the  states,  and  the  known  sources. 

For  a  discrete  time  solution  the  sources,  u   and  i  , 

— s     — n 

may  be  approximated  as  piecewise  linear  functions.   Thus 
the  solution  to  (A. 14)  is  given  by 


Sk+1  =  *^k+riBlVk>+r2BlVk+1)+riB2in(k)+r2B2in(k+1) 


where 


(A. 17) 


AT 


-1 


[e y~    (e    "  I)  ] 

-1 


T2  =  A  1[^f-    (eAT    I)    I] 


(A. 18) 
(A. 19) 
(A. 20) 


Substituting  (A. 15)  and  (A. 17)  in  (A. 16)  yields 


V  (k+1)  =  R  x(k)+R   [Bnu  (k)+B9i  (k)] 
— n  —      1   l— s      z— n 


+    (sx+R   2B1)us(k+l)    +    (s2+Rr2B2)N    f[vn(k+l)] 

(A. 21) 
Knowing   the   values    x(k) ,    u    (k+1),    i    (k) ,    Eq.     (A. 21) 
represents    n   simultaneous   non-linear   equations   which   have 
to   be    solved    for   v    (k+1).       Eq.     (A. 21)    may   be   written   in    the 
form 


A'V    (k+1)  +  B 
— n 


*i  (V 


g      (v  ) 
^m     — n 


n 


(A. 22) 
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For  diodes  or  transistors  the  functions  g. (v  )  are  function 

ri  — n 

of  one  variable  only  and  have  the  general  form 

kVi 
gi(vi)  =  (e   X  -  1)  (A. 23) 

Thus  the  method  (A. 10)  is  directly  applicable  and  yields 
with  the  estimate  v  (k)  the  solution  closest  to  this  point 

v  (k+1) .   Using  v  (k+1)  in  (A. 15)  and  (A. 16)  enables  i  (k+1) 

— n  '  — n  — n 

to  be  calculated.   Using  these  as  initial  values  in  (A. 21) 
permits  the  calculation  cycle  to  be  reiterated. 
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